
Macette

macrumors 6502
Original poster
Mar 5, 2002
472
0
Melbourne
Howdy,

Does anyone know of software for Mac OSX that I can use to strip html tags from a document and turn it into plain text? I've got a whole lot of horribly marked-up pages on a site that I'm redesigning using CSS, and I don't want to have to go through them one by one, getting rid of all the crud.

Something Applescriptable would be cool...
 
Re: Software for turning HTML into Plain Text?

Originally posted by Macette
Howdy,

Does anyone know of software for Mac OSX that I can use to strip html tags from a document and turn it into plain text? I've got a whole lot of horribly marked-up pages on a site that I'm redesigning using CSS, and I don't want to have to go through them one by one, getting rid of all the crud.

Something Applescriptable would be cool...

a few lines of PHP could do it for you, using a regexp to strip out everything between (and including) each < and >
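For anyone who wants to see the regexp idea spelled out, here is a minimal sketch. It is written in Python rather than PHP, purely as an illustration, and the function name strip_tags_regex is made up for this example. It assumes tags never contain a literal > inside an attribute value, and it does nothing special for script or style blocks, so treat it as a starting point rather than a finished tool.

```python
import re
from html import unescape

# Matches a "<", then anything that is not ">", then the closing ">".
TAG_RE = re.compile(r"<[^>]*>")

def strip_tags_regex(markup: str) -> str:
    """Remove everything between < and > (inclusive), then decode entities."""
    without_tags = TAG_RE.sub("", markup)
    # Turn entities such as &amp; back into the characters they stand for.
    return unescape(without_tags)

print(strip_tags_regex("<p>Hello, <b>world</b> &amp; friends</p>"))
# prints: Hello, world & friends
```

The pattern is deliberately simple: it deletes tags but keeps whatever text sat between them, which is usually what you want when rescuing article text from old markup.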
 
thanks - that's a good idea. Although it will probably take me as long to write the script as it would just to delete the tags by hand (but at least I'd learn something...)

ta.
 
thanks! i've been doing a bit of php stuff recently and feel like i'm getting the hang of the syntax, but of course i still don't know a quarter of the available functions. this one is good.
 
Originally posted by Macette
thanks! i've been doing a bit of php stuff recently and feel like i'm getting the hang of the syntax, but of course i still don't know a quarter of the available functions. this one is good.

i doubt many programmers do! there's no need to know many of them, as the php site tells you all you need.

with php (and most other languages) all you need is an idea of how it's structured, then you can find a function that'll do the job.

best piece of advice for any programming? FLOW DIAGRAMS!
 
How about regexp matching in BBEdit?

It'd be easier to play with expressions and see which works best in there than in PHP, IMO.
 
Originally posted by Rower_CPU
How about regexp matching in BBEdit?

It'd be easier to play with expressions and see which works best in there than in PHP, IMO.

good idea, but BBEdit costs money, and solving problems in PHP does your programming the world of good - it's amazing how versatile PHP actually is. i've started using it for a lot of non-web-based stuff.
 
thanks for suggestions - I'd thought of the browser one already, but i've got, well, ten years of quarterly journals at about 12-15 articles an issue... so that's... 600 pages or something.

is it possible to applescript something like that? I've only ever used applescripts made by other people, so I'm not really sure of what i can do with it.

i've got bbedit - the full version! registered! paid for! - but again, am looking to do this thing as a batch, and haven't worked out how to make bbedit do that.
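On the batch question, one possible approach is a small script that walks a folder and writes a plain-text copy of every page. The sketch below is in Python (not AppleScript or BBEdit) and uses only the standard library. The names TextExtractor, html_to_text and convert_folder are invented for this example, and it assumes the pages end in .html and are readable as UTF-8.

```python
from html.parser import HTMLParser
from pathlib import Path

class TextExtractor(HTMLParser):
    """Collects the text between tags, skipping script and style content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def html_to_text(markup):
    """Return the text content of one HTML document."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)

def convert_folder(src, dest):
    """Write a .txt copy of every .html file in src into dest; return the count."""
    src, dest = Path(src), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    count = 0
    for page in sorted(src.glob("*.html")):
        markup = page.read_text(encoding="utf-8", errors="replace")
        (dest / (page.stem + ".txt")).write_text(html_to_text(markup), encoding="utf-8")
        count += 1
    return count

# Example call (folder names are placeholders):
# convert_folder("old_site", "plain_text")
```

The originals are left untouched, so if the output looks wrong for some pages you can adjust the script and run it again over the whole set.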
 