The best answer on StackOverflow: Using RegEx to parse HTML (stackoverflow.com)
BaumGeist@lemmy.ml to Programmer Humor@lemmy.ml · 3 months ago · 36 comments

moriquende@lemmy.world · 3 months ago
It can’t be done, as an opening tag in HTML can contain anything in its attributes, even JavaScript (e.g. an onclick handler).

schnurrito@discuss.tchncs.de · 3 months ago
??? Non sequitur

moriquende@lemmy.world · 3 months ago
You can’t parse every HTML opening tag with regex, because an HTML opening tag doesn’t have a set structure. How would you match, with regex, this opening tag?
<mytag myattribute="<value of \"myattribute\">" >

schnurrito@discuss.tchncs.de · edited 3 months ago
Is this valid HTML? My understanding is that that attribute value needs to be escaped, i.e. <value of \"myattribute\">.

moriquende@lemmy.world · 3 months ago
The double quote does not need to be escaped when the attribute value is delimited by single quotes, and neither does the rest. This is valid and tested:
<img alt='my "<img>"'>
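
To make the last example concrete, here is a minimal sketch in Python. It is not something anyone in the thread posted. The pattern `NAIVE_TAG` and the class `TagCollector` are made-up names for illustration, and the naive pattern is just one common first attempt, not the only regex one could write. It compares that pattern with the standard library's `html.parser` on the `<img>` tag above.

```python
import re
from html.parser import HTMLParser

# Hypothetical naive pattern: "<", then anything that is not ">", then ">".
NAIVE_TAG = re.compile(r"<[^>]*>")


class TagCollector(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, attrs))


# The tag from the comment above: a ">" sits inside the quoted attribute value.
html = """<img alt='my "<img>"'>"""

# The naive pattern stops at the first ">", which is inside the attribute.
naive_match = NAIVE_TAG.search(html).group(0)
print(naive_match)

# The parser tracks the quotes and reads the tag to its real end.
collector = TagCollector()
collector.feed(html)
collector.close()
print(collector.tags)
```

The naive match ends in the middle of the `alt` value, while the parser reports a single `img` start tag whose `alt` attribute is `my "<img>"`.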