
Maxal

macrumors newbie
Original poster
Aug 22, 2014
Hi

Being an absolute newbie in AppleScripting, I have run into a nagging situation. I get MS Word files from a client which have some unwanted text at the beginning and at the end of the files. I have the following AppleScript to delete 4 unwanted sentences at the top of every file:

Code:
tell application "Microsoft Word"
	
	set aRange to sentence 1 of active document
	set aRange to set range aRange start (start of content of aRange) ¬
		end (end of content of sentence 4 of active document)
	delete aRange
	
end tell

The problem is that I don't know how to make this script delete the last 4 sentences as well.

I would also like to know how I could delete a certain number of paragraphs from the bottom up in an MS Word file.

Thanks in advance :)
Max
 
edit - this is probably simpler?


to delete the first four sentences

Code:
tell application "Microsoft Word"
	
	delete (sentences 1 thru 4 of active document)
	
end tell


to delete the last four sentences

Code:
tell application "Microsoft Word"
	
	repeat 4 times
		delete sentence -1 of active document
	end repeat
	
end tell

sentence -1 is the last sentence of the document, so repeating that four times deletes the last four sentences
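
If both ends need trimming in one go, the two snippets above can probably just be combined. This is only a sketch built from the same two commands; it assumes the document holds at least eight sentences and does no error checking.

```applescript
tell application "Microsoft Word"
	-- Trim the tail first: sentence -1 always refers to the current last sentence
	repeat 4 times
		delete sentence -1 of active document
	end repeat
	-- Then remove the first four sentences in one go
	delete (sentences 1 thru 4 of active document)
end tell
```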


The code below deletes the last 3 paragraphs.

Just change the value in the set theParasToDelete to 3 line
from 3 to however many paragraphs you want to remove from the end of the document.


Code:
tell application "Microsoft Word"
	
	set theParasToDelete to 3
	
	set theParasToDelete to theParasToDelete - 1
	
	set theLastPara to count every paragraph of active document
	set theStartPara to theLastPara - theParasToDelete
	
	set startpoint to start of content of the text object of paragraph theStartPara of active document
	set endpoint to end of content of the text object of paragraph theLastPara of active document
	
	
	tell active document
		set theRange to create range start startpoint end endpoint
		delete theRange
	end tell
	
end tell
 
Thanks a lot snorkelman, it worked like a charm :)

Please allow me to seek some more help from you:)

Is there a way to search a selected text range for some certain elements in an MS Word document?

For instance, I want to search the first 4 paragraphs of a document to see if any of the paragraphs starts with (Add, with By or with * and delete those paragraphs?

Thanks again snorkelman :)
 
critical line would be

Code:
if content of the text object of paragraph 1 of the active document starts with "Add " then

Declare a variable for each paragraph's deletion status and set it to 0 initially.

Apply all three tests to para 1; any match sets the deletion status for that paragraph to 1.

Repeat for para 2, then para 3, then para 4.

Once you've done that, check each paragraph's deletion status and delete as appropriate in reverse order

(that stops the shift in paragraph positions caused by one deletion from affecting the position of the next paragraph you have to delete).

The code below could be reduced a bit further with loops but should do the job as is.

Code:
tell application "Microsoft Word"
	
	
	set delPara1 to 0
	set delPara2 to 0
	set delPara3 to 0
	set delPara4 to 0
	
	
	if content of the text object of paragraph 1 of the active document starts with "Add " then
		set delPara1 to 1
	end if
	
	if content of the text object of paragraph 1 of the active document starts with "By " then
		set delPara1 to 1
	end if
	
	if content of the text object of paragraph 1 of the active document starts with "* " then
		set delPara1 to 1
	end if
	
	
	if content of the text object of paragraph 2 of the active document starts with "Add " then
		set delPara2 to 1
	end if
	
	if content of the text object of paragraph 2 of the active document starts with "By " then
		set delPara2 to 1
	end if
	
	if content of the text object of paragraph 2 of the active document starts with "* " then
		set delPara2 to 1
	end if
	
	
	if content of the text object of paragraph 3 of the active document starts with "Add " then
		set delPara3 to 1
	end if
	
	if content of the text object of paragraph 3 of the active document starts with "By " then
		set delPara3 to 1
	end if
	
	if content of the text object of paragraph 3 of the active document starts with "* " then
		set delPara3 to 1
	end if
	
	
	
	if content of the text object of paragraph 4 of the active document starts with "Add " then
		set delPara4 to 1
	end if
	
	if content of the text object of paragraph 4 of the active document starts with "By " then
		set delPara4 to 1
	end if
	
	if content of the text object of paragraph 4 of the active document starts with "* " then
		set delPara4 to 1
	end if
	
	
	
	
	if delPara4 is equal to 1 then
		set startpoint to start of content of the text object of paragraph 4 of active document
		set endpoint to end of content of the text object of paragraph 4 of active document
		
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
	end if
	
	if delPara3 is equal to 1 then
		
		set startpoint to start of content of the text object of paragraph 3 of active document
		set endpoint to end of content of the text object of paragraph 3 of active document
		
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
	end if
	
	
	if delPara2 is equal to 1 then
		set startpoint to start of content of the text object of paragraph 2 of active document
		set endpoint to end of content of the text object of paragraph 2 of active document
		
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
	end if
	
	
	if delPara1 is equal to 1 then
		set startpoint to start of content of the text object of paragraph 1 of active document
		set endpoint to end of content of the text object of paragraph 1 of active document
		
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
	end if
	
	
end tell
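
For anyone curious about the loop version mentioned above, here is a rough sketch. It assumes the same three prefixes and the same first four paragraphs; thePrefixes, paraText and shouldDelete are names made up for this example. Checking and deleting from paragraph 4 down to 1 keeps the reverse-order idea, since the paragraphs above the current one have not moved yet.

```applescript
set thePrefixes to {"Add ", "By ", "* "}

tell application "Microsoft Word"
	-- Work from paragraph 4 down to 1 so a deletion never shifts a paragraph still to be checked
	repeat with i from 4 to 1 by -1
		set paraText to content of the text object of paragraph i of the active document
		set shouldDelete to false
		repeat with aPrefix in thePrefixes
			if paraText starts with (contents of aPrefix) then
				set shouldDelete to true
				exit repeat
			end if
		end repeat
		if shouldDelete then
			set startpoint to start of content of the text object of paragraph i of active document
			set endpoint to end of content of the text object of paragraph i of active document
			tell active document
				set theRange to create range start startpoint end endpoint
				delete theRange
			end tell
		end if
	end repeat
end tell
```

To test more or fewer paragraphs, change the 4 in the repeat line; to test other prefixes, edit the list at the top.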
 
Magic — pure magic. Worked without any flaws :) Short of words to thank you man :)

By the way, I was struggling with an experiment here. I was trying something which I am sure won't make any sense at all:

Code:
tell application "Microsoft Word"
		
	if content of the text object of paragraph 2 of the active document contains ", Aug 25 " then
		delete
	end if
	
end tell

In the second paragraph of every Word file, I get the current date in this format: ", Aug 25 " (without quote marks), which I have to delete in many files, a big headache. The next day it will be ", Aug 26 ". It looks like it will require some dynamic scripting to eliminate a string that changes every day.

By the way, thanx a ton for all your help :)
 
You're welcome. The script below should take care of creating the search phrase. It uses the current date, so as long as the documents you're checking carry the current date it should work OK. Just be sure to check the if statements to make sure the short month values I picked (Jun, Jul, Aug etc.) all suit what's in use in your documents.

As with previous scripts there may well be much easier/faster ways to script the Word tell block part so consider this a quick n dirty fix (I don't usually do Word scripting)

Code:
set theMonth to month of (current date) as string

if theMonth is equal to "January" then
	set theSearchMonth to "Jan"
end if

if theMonth is equal to "February" then
	set theSearchMonth to "Feb"
end if

if theMonth is equal to "March" then
	set theSearchMonth to "Mar"
end if

if theMonth is equal to "April" then
	set theSearchMonth to "Apr"
end if

if theMonth is equal to "May" then
	set theSearchMonth to "May"
end if

if theMonth is equal to "June" then
	set theSearchMonth to "Jun"
end if


if theMonth is equal to "July" then
	set theSearchMonth to "Jul"
end if


if theMonth is equal to "August" then
	set theSearchMonth to "Aug"
end if

if theMonth is equal to "September" then
	set theSearchMonth to "Sep"
end if

if theMonth is equal to "October" then
	set theSearchMonth to "Oct"
end if

if theMonth is equal to "November" then
	set theSearchMonth to "Nov"
end if

if theMonth is equal to "December" then
	set theSearchMonth to "Dec"
end if

set dateSearch to ", " & theSearchMonth & " " & day of (current date) & " " as string


tell application "Microsoft Word"
	
	if content of the text object of paragraph 2 of the active document contains dateSearch then
		
		set theString to content of the text object of paragraph 2 of the active document
		set AppleScript's text item delimiters to dateSearch
		set theList to text items of theString
		set AppleScript's text item delimiters to ""
		
		set startCount to count item 1 of theList
		set selectionCount to count dateSearch
		
		set startpoint to (start of content of text object of paragraph 2 of the active document) + startCount
		set endpoint to startpoint + selectionCount
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
		
	end if
	
end tell

The Word tell block deletes the search phrase from para 2 if it finds it there, leaving the rest of the paragraph intact.

If you want to delete the whole of para 2 whenever it contains the search phrase, replace that Word tell block with this one:

Code:
tell application "Microsoft Word"
	
	if content of the text object of paragraph 2 of the active document contains dateSearch then

		set startpoint to start of content of text object of paragraph 2 of the active document
		set endpoint to end of content of text object of paragraph 2 of the active document
		
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
		
	end if
	
end tell
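
The twelve if blocks that build theSearchMonth could probably be collapsed into a list lookup. This is just a sketch: shortNames and today are names invented for the example, and it relies on a month value coercing to an integer from 1 to 12. The abbreviations still need checking against what the documents actually use.

```applescript
-- Abbreviations in calendar order; edit any entry to match the documents
set shortNames to {"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"}

set today to current date
-- The month coerces to its number (1 for January), which picks the matching list item
set theSearchMonth to item ((month of today) as integer) of shortNames

set dateSearch to ", " & theSearchMonth & " " & (day of today) & " "
```

Either Word tell block above can then follow unchanged, since it only needs dateSearch.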
 
@snorkelman, I will be saving at least 2 hours on a daily basis because of the help you extended to me. Every piece of your code worked flawlessly. Your username should have been superman :)

By the way, could the last script, which you wrote for deletion in the top four paragraphs, also be used to delete something anywhere within a file? Or will it be something else for find and delete?
 
That's good to hear they're working out for you, anything that saves someone a bit of time is a result in my book :)

When you say extending the one with search and delete in the first four paragraphs, is the desired behaviour of the extended version going to be

a ) to delete each paragraph when you find an instance of your search or

b ) just remove the search phrase itself from whichever paragraph it's found in?

lmk which and I'll take another look (if nothing else I'm sure I can come up with something a bit neater than the original versions)

The logic itself is pretty trivial; it's adapting it to AppleScript (more particularly Word's AppleScript limitations) that's the only awkward bit.

Been working today and back in again tomorrow so probably Thurs before I get another chance to look at it for you.
 
That's true. Not only me, there are around a dozen people who will be saving the same amount of time on a daily basis. We need to process hundreds of MS Word files every day to weed out some unwanted text in each file. Thanks to your help, most of the process is now automated :)


It's b. I want to be able to delete the search term wherever it is found. Currently, I am using the following script to find something and delete it wherever it is found.

Code:
set findRange to find object of selection
tell findRange
	execute find find text "sometext" replace with "" replace replace all
end tell

The code works fine, but it leaves double spacing. But I use the same code at the end to find double spacing and replace it with single space. So it works :)

Secondly, can we close an untitled Word document as plain text (.txt) with default File Conversion settings so that the setting prompt does not pop up at all?
 
Do you mean saving an untitled Word document as plain text (.txt)? You can do something like this :

Code:
tell application "Microsoft Word"
	set myDoc to active document
	save as myDoc file name "TestDocument" file format format text
end tell

Note : The open command has a confirm conversions parameter.
 
Hey kryten2, I want a script which, once activated, brings me the Save prompt to let me give the document a name of my choice, but it has the simple text option preselected, instead of .docx extension. And the following File Conversion prompt is not required.

Your script saves the file with a preset file name and location. If I get the Save prompt, I would be able to give the files names of my choice and decide where to save them. Thanks for your help :)
 
@snorkelman, could the script that you posted in post #6 also be slightly modified to search for the month and date in this format: "Aug. 28--" (without quotes)? Next month it will be like: "Sept. 1--". I tried to change the format but couldn't get the script to find and delete this format. The string, however, remains in the second paragraph of the document.
 
Try this :

built-in Word dialog box :

Code:
tell application "Microsoft Word"
	activate
	(*
	Returns or sets if the project gallery dialog will be shown when starting Microsoft Word.
*)
	set startup dialog to false
	(*
	Returns or sets the default format. Common settings include: document = WordDocument, document template = Template, Word 97-2004 document = Doc97, Word XML document = XML, web page = Html, Text only = Text, RTF = Rtf, unicode text = Unicode.
*)
	set default save format to "Text"
	(*
	show (verb): Displays and carries out actions initiated in the specified built-in Word dialog box. Returns a number which indicates the button used to dismiss the dialog box.
*)
	show (get dialog dialog file save as)
end tell

or StandardAdditions choose file name :

Code:
(*
choose file name v : Get a new file reference from the user, without creating the file
choose file name
[with prompt text] : the prompt to be displayed in the dialog box
[default name text] : the default name for the new file
[default location alias] : the default file location
→ file : the file the user specified
*)

tell application "Microsoft Word"
	set myDoc to active document
	save as myDoc file name ((choose file name) as text) file format format text
end tell
 
hey maxal

alter the set dateSearch line in that script so it reads

Code:
set dateSearch to theSearchMonth & ". " & day of (current date) & "--" as string

should do the trick, don't forget to alter the set theSearchMonth line for September so that it sets it to Sept rather than the Sep that I had in my version

still not had a chance to look at the other one for you, got roped into three days at work this week rather than just two will see if I can get it sorted for tomoz :)



Edit: one thing to watch for is how Word's autoformat has handled the -- part of the original document; it may be treating it as an 'em dash' and automatically joining the two - characters into one longer dash character.

to guard against that I'd modify the code from set dateSearch onwards as follows, so that finding either style will work

Code:
set dateSearch to theSearchMonth & ". " & day of (current date) & "--" as string
set dateSearch2 to theSearchMonth & ". " & day of (current date) & "—" as string
set deleteIt to 0


tell application "Microsoft Word"
	
	if content of the text object of paragraph 2 of the active document contains dateSearch then
		set deleteIt to 1
	end if
	
	if content of the text object of paragraph 2 of the active document contains dateSearch2 then
		set deleteIt to 2
	end if
	
	
	if deleteIt is greater than 0 then
		
		set theString to content of the text object of paragraph 2 of the active document
		
		if deleteIt is greater than 1 then
			set dateSearch to dateSearch2
		end if
		
		set AppleScript's text item delimiters to dateSearch
		set theList to text items of theString
		set AppleScript's text item delimiters to ""
		
		set startCount to count item 1 of theList
		set selectionCount to count dateSearch
		
		set startpoint to (start of content of text object of paragraph 2 of the active document) + startCount
		set endpoint to startpoint + selectionCount
		tell active document
			set theRange to create range start startpoint end endpoint
			delete theRange
		end tell
		
	end if
	
end tell
 
@kryten2, both of your scripts worked perfectly. I have started to use the first one. No issues. Thank you very much for your help :)

----------

@snorkelman, yesssssssss, it works :) No hassles. Great job again. A bundle of thanks once again :) For the other script, no problems....take your time. You have already been so helpful :)
 
Maxal, the following should let you search and replace selectively within a range rather than across a whole document.

Set the search term, the replacement value, the paragraph you want to begin searching from and how many paragraphs you want to search.

Code:
set theSearchTerm to " SEARCH"
set theReplacement to ""
set firstPara to 3
set numberOfParas to 5

tell application "Microsoft Word"
	
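	set theCount to count paragraphs of active document
	set lastPara to firstPara + numberOfParas
	-- stop at the end of the document if it has fewer paragraphs than requested
	if lastPara > theCount + 1 then set lastPara to theCount + 1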
	set i to firstPara
	repeat until i is equal to lastPara
		
		if content of the text object of paragraph i of the active document contains theSearchTerm then
			
			set theString to content of the text object of paragraph i of the active document
			set AppleScript's text item delimiters to theSearchTerm
			set theList to text items of theString
			set AppleScript's text item delimiters to ""
			set itemsCount to count items of theList
			set x to 1
			set runningTotal to 0
			
			repeat until x is equal to itemsCount
				set startCount to (count (item x of theList)) + runningTotal
				
				set selectionCount to (count theSearchTerm)
				set startpoint to (start of content of text object of paragraph i of the active document) + startCount
				set endpoint to startpoint + selectionCount
				tell active document
					set theRange to create range start startpoint end endpoint
					set content of theRange to theReplacement
				end tell
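				-- each earlier match now occupies (count theReplacement) characters in the paragraph
				set runningTotal to runningTotal + (count (item x of theList)) + (count theReplacement)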
				set x to x + 1
			end repeat
			
		end if
		set i to i + 1
	end repeat
end tell
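
The script above finds each match by splitting the paragraph text on the search term with text item delimiters. The same splitting can be tried on a plain string, which may be handy for checking a search term before pointing it at Word. This is only a sketch: replaceText is a made-up handler name, it works on plain text only, and it does not touch the document or its formatting.

```applescript
on replaceText(theString, theSearchTerm, theReplacement)
	-- Remember the current delimiters so they can be restored afterwards
	set savedDelimiters to AppleScript's text item delimiters
	-- Split the text wherever the search term occurs
	set AppleScript's text item delimiters to theSearchTerm
	set theList to text items of theString
	-- Join the pieces back together with the replacement in between
	set AppleScript's text item delimiters to theReplacement
	set newString to theList as text
	set AppleScript's text item delimiters to savedDelimiters
	return newString
end replaceText

replaceText("alpha SEARCH beta SEARCH gamma", " SEARCH", "")
-- result: "alpha beta gamma"
```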
 
@snorkelman, thanx for the script once again. It's fine and does what it is supposed to do. Brilliant work man :)
 