Remove a specific string from a text file using sed

Issue

I have a file called a.txt that contains:

time="2022-08-02T15:07:53+05:30" level=info msg="\x1b[32m\x1b[1mPUBLIC\x1b[39m\x1b[0m http://some.s3-ap-southeast-2.amazonaws.com/ (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:53+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:54+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:58+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some-assets.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:08:01+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"

I want this output

PUBLIC    http://some.s3-ap-southeast-2.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some-assets.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com

I tried this

cat a.txt | cut -d "=" -f4- | cut -d "[" -f3- | cut -d "m" -f2- |  awk -F '\\.amazonaws.com' '{print $1".amazonaws.com"}'

This works, but I am not able to remove \x1b[39m\x1b[0m.

Solution

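Using sed with extended regular expressions (-E), where input_file is the log file (a.txt in the question). The pattern skips everything up to and including the second "[", skips the non-uppercase characters that follow, captures the status word up to the next backslash, skips the remaining escape codes up to the space, and captures the URL through its final ".com"-style suffix. column -t then aligns the two fields: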

$ sed -E 's~([^[]*\[){2}[^A-Z]*([^\]*)[^ ]* ([^ ]*\.[a-z]+).*~\2 \3~' input_file | column -t
PUBLIC     http://some.s3-ap-southeast-2.amazonaws.com
FORBIDDEN  http://some.s3.amazonaws.com
FORBIDDEN  http://some.s3.amazonaws.com
FORBIDDEN  http://some-assets.s3.amazonaws.com
FORBIDDEN  http://some.s3.amazonaws.com
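
If the single pattern above is hard to follow, the same result can probably be reached in two smaller steps: first delete the colour codes, then pick out the status and the URL. The sketch below is not the answerer's method. It assumes that the log holds the literal four-character text \x1b (a backslash followed by x1b) rather than real escape bytes, that the status is a run of uppercase letters, and that the URL starts with http:// or https://. The function name extract_status_url is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Reads the named file(s), or standard input when no file is given.
extract_status_url() {
    # 1st expression: delete every literal "\x1b[<digits>m" colour code.
    #   [\\] matches one literal backslash; \[ matches a literal "[".
    # 2nd expression: keep the uppercase status word and the URL's
    #   scheme and host (the match stops at the first "/" or space).
    # Lines without a msg="STATUS URL" part pass through unchanged
    # apart from the colour-code removal.
    sed -E \
        -e 's/[\\]x1b\[[0-9;]*m//g' \
        -e 's~^.*msg="([A-Z]+) +(https?://[^ /]+).*~\1 \2~' \
        "$@"
}
```

Running extract_status_url a.txt | column -t should then line up the two columns in the same way as the output shown above.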

Answered By – HatLess

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
