You say you use AWK on a daily basis, including for web scraping. Most programmers use the Python libraries Scrapy and BeautifulSoup, or R libraries like rvest and RCurl. How do you parse HTML with AWK: as part of a pipeline including wget, hxselect, and lynx, or just with AWK regular expressions? I couldn't find many examples, except for some basic Rosetta Code scripts and random blog posts. Could you share an example script?