Aug-27-2022, 05:11 PM
hi,
I have some pages where "the content" is not wrapped in any specific tag (<something><like><this>) other than <html>,
so I can't use BeautifulSoup to select the content.
The other parts of the page look like this:
<div>
random content that I don't need
another random line
</div>
I'd like to remove them: anything that starts with <random> and ends with </random>, including <div>random</div>, <script language....>thescript</script>, or anything else, except the outer <html> to </html>.
If the block is on a single line I can do it,
but these blocks are multi-line, and the length is random: it can be 1 line or dozens of lines.
So, how do I write a regex for this?
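
For reference, here is a minimal sketch of one way this might be done with the standard `re` module. It assumes the pages are loaded as one string, that elements with the same tag name are not nested inside each other (a <div> inside a <div> would be cut at the first </div>), and that only paired tags need removing. The names `PAIRED_TAG`, `strip_tagged_blocks`, and the sample page are made up for illustration. The key part is `re.DOTALL`, which lets `.` match newlines so a block can span any number of lines.

```python
import re

# Matches <tag ...> ... </tag> for any tag name except html.
# - (?!html\b) skips the opening <html> tag, so the outer wrapper is kept.
# - \1 requires the closing tag to have the same name as the opening tag.
# - re.DOTALL lets "." cross newlines; ".*?" is lazy, so it stops at the
#   first matching closing tag instead of the last one on the page.
PAIRED_TAG = re.compile(
    r"<(?!html\b)([a-zA-Z][\w-]*)(?:\s[^>]*)?>.*?</\1\s*>",
    re.DOTALL | re.IGNORECASE,
)

def strip_tagged_blocks(page: str) -> str:
    """Remove every paired element except <html>, keeping the untagged text."""
    return PAIRED_TAG.sub("", page)

# Hypothetical sample page, modelled on the example in the question.
sample = """<html>
keep this line
<div>
random content that I don't need
another random line
</div>
keep this too
<script language="x">the
script</script>
</html>"""

cleaned = strip_tagged_blocks(sample)
print(cleaned)
```

The removed blocks leave blank lines behind, which can be collapsed afterwards if needed. Regex is not a real HTML parser, so if the pages turn out to have nested or malformed tags, this sketch will likely need adjusting.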