Don't parse HTML with regex; use a proper HTML parser instead.
### theory :
According to compiler theory, HTML can't be parsed with regular expressions, which are based on finite state machines. Because HTML is hierarchically nested, you need a pushdown automaton, for example a parser built from an LALR grammar with a tool like YACC.
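To make the stack argument concrete, here is a minimal sketch in Python using only the standard library's `html.parser`. The names `DepthTracker`, `max_nesting` and `VOID` are made up for this illustration, and the list of void elements is deliberately incomplete. It only shows that tracking nesting needs a stack, which a finite state machine does not have; it is not a full HTML validator.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not pushed on the stack.
# (Assumption: a short, incomplete list that is enough for the illustration.)
VOID = {"br", "hr", "img", "input", "link", "meta"}


class DepthTracker(HTMLParser):
    """Track element nesting with an explicit stack (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []      # elements that are currently open
        self.max_depth = 0   # deepest nesting seen so far

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        self.stack.append(tag)
        self.max_depth = max(self.max_depth, len(self.stack))

    def handle_endtag(self, tag):
        # Pop only when the closing tag matches the innermost open element.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def max_nesting(html):
    """Return the deepest element nesting found in the given HTML string."""
    tracker = DepthTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.max_depth
```

Calling `max_nesting("<div><div><p>x</p></div></div>")` walks three levels down, pushing on every opening tag and popping on every matching closing tag. That unbounded counting is exactly what a plain regex can't do.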
### realLife©®™ everyday tool :
Instead, you should use the right tool for the job.
...and it's a job for xmllint :
by _string matching_ :
string="Sorcery"
xmllint --html --xpath "//p[contains(text(), '$string')]/text()" file_or_URL
by the Nth `<p>` node, where N is 1 here :
xmllint --html --xpath "//p[1]/text()" file_or_URL
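
If xmllint isn't at hand, the same two lookups can be roughly approximated with Python's standard `html.parser`. This is only a sketch under assumptions: `ParagraphText` and `paragraph_texts` are hypothetical names, the helper collects all text inside each `<p>` (including text of nested inline tags), and "first" means first in document order, which is not exactly what `//p[1]` means in XPath.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of each <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # one string per <p>, in document order
        self.in_p = False     # True while inside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly ends any paragraph that is still open.
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.paragraphs[-1] += data


def paragraph_texts(html):
    """Return the text content of every <p> in the given HTML string."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

With `texts = paragraph_texts(html)`, string matching becomes `[t for t in texts if "Sorcery" in t]` and the first paragraph is `texts[0]`.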
Check <