I'm trying to parse the contents of an HTML file to scrape a download directory. I've reduced my command to an MWE that reproduces the issue:
sed -e 's|\(href\)|\1|' index.html
This prints the entirety of index.html. I originally thought the problem was my expression, but this very basic expression behaves the same way, so that can't be it.
The same happens if I remove -e or if I add g at the end.
It's been a while since I've used sed. Am I doing something wrong here? Is sed getting confused by the characters in an HTML file?
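
For reference, here is a minimal sketch of sed's default printing behavior, which may be what is happening here. By default sed prints every input line after processing it, whether or not the substitution matched; the -n option suppresses that automatic printing, and the p flag on the s command prints only the lines where a substitution was made. The sample file below is a hypothetical stand-in for index.html, and it assumes the goal is to see only the lines containing href.

```shell
# Hypothetical sample input standing in for index.html
sample=$(mktemp)
printf '%s\n' \
  '<html>' \
  '<a href="file1.zip">file1</a>' \
  '<p>no link here</p>' \
  '<a href="file2.zip">file2</a>' \
  '</html>' > "$sample"

# Default behavior: every line is printed, matched or not
default_out=$(sed -e 's|\(href\)|\1|' "$sample")

# -n turns off automatic printing; the trailing p prints only
# the lines on which the substitution actually matched
matched_out=$(sed -n 's|\(href\)|\1|p' "$sample")

printf '%s\n' "--- default ---" "$default_out"
printf '%s\n' "--- with -n and p ---" "$matched_out"

rm -f "$sample"
```

If that assumption holds, the second command should show only the two anchor lines, while the first reproduces the whole file.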