Software and Scripts for Processing

Some notes made by Pascal Le Roux following EPAC 2002,
re-formatted for the web by John Poole in July 2003.

The scripts are available on application from JACoW, but they come without support or guarantees!

 

You need the Visual Basic Script (VBScript) environment. If necessary, you can download the latest version (free) here:

http://msdn.microsoft.com/downloads/default.asp?url=/downloads/sample.asp?url=/msdnfiles/027/001/733/msdncompositedoc.xml

(or just search for "Microsoft Windows Script 5.6"...)

 

The documentation is here:

http://download.microsoft.com/download/winscript56/Install/5.6/W982KMeXP/EN-US/scrdoc56en.exe

or

http://www.msdn.microsoft.com/library/default.asp?url=/library/enus/script56/html/wsoriWindowsScriptHost.asp

 

To run a script, just double-click the script file in Windows Explorer.

 

All the scripts that connect to the database to fetch data include the file adovbs.wsf, using the line <script language="VBScript" src="adovbs.wsf"/>. You will find this file in all the attached zip files, and you can use it as it is. It contains constants and definitions used for connecting to the database.

 

 

I used Acrobat 5.0.5, Pitstop 5.01 and Impress Pro v2.0.

 

Setting the paper size to JACoW standards (we should call this "resizing the media box"):

I did not use a script for this. Instead, I downloaded Enfocus Pitstop Server 2.0 (trial version):

http://www.enfocus.com/products/overview.php?nr=2

 

You set a "hot" folder (the folder where your PDF files are located). You can then select any of the actions that you have defined in the Action list panel in Pitstop Pro (the Acrobat plug-in, not the server version). I created an action slightly different from the one used for the processing: it removes the crop box and resizes the media box. The sequence for this action is the following:

Select All

Remove Crop Box

Select All

Resize Media box (Upper right 595 x 792 pt)

 

You set the output folders (success and failure) in the Folder tab.

You activate the Hot Folder in the General tab.

It then starts processing all the PDF files in the hot folder.

 

In most cases it worked fine, but in about 10 of the 919 PDF files the content was shifted upward.

 

Numbering the pages:

 

I used the script insertPageNumberFromDB.wsf (in insertpagenumber.zip).

It connects to the database to retrieve the PAPER_CODE and PAGENUMBER.

It simulates keyboard input by sending commands to Acrobat and Impress Pro (similar to the WinBatch script).

 

First you have to configure Impress Pro.

For EPAC 2002, two steps were required:

Apply the string "Proceedings of EPAC 2002, Paris, France" at the top of the page

Apply the page numbers at the bottom

 

I set up these two actions. The first one is:

Text : "Proceedings of EPAC 2002, Paris, France"

Justification : centre

Layer : Foreground

Font: Times-Roman

size: 10 normal

Page  range: All Page

Position: Custom

Horiz: 297

Vert: 35

Units: Point

From : Top

Color: Black

Page Spread : Odd and Even Page

 

Save this configuration under the name "middlec".

 

The second action:

Text : <PageNumber:3>    (the script changes the number 3 for every paper)

Justification : centre

Layer : Foreground

Font: Times-Roman

size: 10 normal

Page  range: All Page

Position: Custom

Horiz: 297

Vert: 35

Units: Point

From : Bottom

Color: Black

Page Spread : Odd and Even Page

 

Save this configuration under the name "middlen".
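
As a rough illustration of the two configurations above, the Python sketch below builds the page number tag for one paper and converts the "Vert" setting into a distance from the bottom edge of a 595 x 792 pt page. It is not part of the JACoW scripts. The function names are invented, and it assumes that the number in the tag is the paper's first page number and that "Vert" is a plain offset from the named edge; neither is confirmed by the notes.

```python
PAGE_HEIGHT_PT = 792  # JACoW page height used elsewhere in these notes


def page_number_tag(first_page: int) -> str:
    """Build the text for the "middlen" action (assumed: number = first page)."""
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    return f"<PageNumber:{first_page}>"


def distance_from_bottom(vert_pt: float, measured_from: str,
                         page_height_pt: float = PAGE_HEIGHT_PT) -> float:
    """Convert a "Vert" offset measured from Top or Bottom into points from the bottom edge."""
    origin = measured_from.strip().lower()
    if origin == "bottom":
        return vert_pt
    if origin == "top":
        return page_height_pt - vert_pt
    raise ValueError("measured_from must be 'Top' or 'Bottom'")
```

With the values above, the header text sits 757 pt above the bottom edge and the page number 35 pt above it.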

 

In the script insertPageNumberFromDB.wsf,  you must change:

- the "strSourceFolder" variable. It's the location (path) of the PDFs.

- the line:

dbConnection.ConnectionString ="Driver={Microsoft ODBC for Oracle};server=WRITE_THE_SERVER_NAME_HERE;Uid=WRITE_THE_DBU_NAME_HERE;Pwd=WRITE_THE_PASSWORD_HERE"
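
To show how the three placeholders fit into that string, here is a small Python sketch. It is not part of the JACoW scripts, and the function name and the checks are invented for illustration; it only reproduces the shape of the line above.

```python
def build_connection_string(server: str, user: str, password: str) -> str:
    """Assemble a connection string with the same layout as the line above."""
    for name, value in (("server", server), ("user", user), ("password", password)):
        # A ';' inside a value would split the string into extra fields.
        if not value or ";" in value:
            raise ValueError(f"{name} must be non-empty and must not contain ';'")
    return (
        "Driver={Microsoft ODBC for Oracle};"
        f"server={server};Uid={user};Pwd={password}"
    )
```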

 

You may have to change the "WScript.Sleep" lines. The script does not wait for Acrobat and Impress Pro to finish, so it has to be paused with the Sleep procedure; otherwise it runs faster than Acrobat and Impress Pro can process the commands sent to them. The delays needed depend on the processing speed of the particular computer.

I noticed that a few papers were processed twice, because the script may start processing the next paper before the previous one has finished.
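
One possible alternative to fixed delays, which is not what the script does, is to poll for a condition that shows the previous paper is finished and to give up after a timeout. The Python sketch below shows only the waiting logic. The function name and the default times are invented, and the condition to test (for example, that the output file has been written) is left to the caller.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float = 30.0, poll_s: float = 0.5) -> bool:
    """Poll condition() until it returns True or timeout_s has elapsed.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A caller would move on to the next paper only when wait_until returns True, and log the paper code otherwise.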

 

 

Hidden Fields:

I've used the "setHiddenFieldsFromDB.wsf" script (in FilllInthehiddenfieldsfromthedatabase.zip)

 

For every PDF in a folder (the folder is the parameter of the SetHiddenFields method):

 

It reads the name of the PDF file (e.g. THPDO004.pdf) and extracts the paper code "THPDO004" from this filename.

It searches the database for the hidden fields for this paper code (THPDO004).

Then, using Acrobat IAC, it opens the PDF file, enters the hidden fields, creates the thumbnails, and saves the file optimized.

 

For IAC, see the Acrobat Interapplication Communication (IAC) Reference:

http://partners.adobe.com/asn/developer/acrosdk/docs/iac/IACOverview.pdf

http://partners.adobe.com/asn/developer/acrosdk/docs/iac/IACReference.pdf
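
The first step, getting the paper code from the filename, can be sketched in Python as follows. This is an illustration only, not the code of setHiddenFieldsFromDB.wsf; the function name is invented, and it assumes the paper code is simply the filename without its .pdf extension.

```python
from pathlib import PureWindowsPath


def paper_code_from_filename(filename: str) -> str:
    """Return the paper code for a PDF path, e.g. THPDO004.pdf -> THPDO004."""
    path = PureWindowsPath(filename)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {filename}")
    return path.stem
```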

 

Misc. :

In the file PageCountAndPageSizechecking.zip, you will find two scripts:

getPageNumberAndPageSize.vbs opens all the PDF files in a folder, counts the number of pages and checks the size of every page. If a page is not 595 x 792, it writes a comment in the log file.
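
The size check itself is simple. A minimal Python sketch is given below, assuming the page sizes have already been read from the PDF as (width, height) pairs in points. The function name, the message format and the small tolerance are inventions for this example and are not taken from getPageNumberAndPageSize.vbs.

```python
EXPECTED_SIZE_PT = (595, 792)


def check_page_sizes(paper_code, page_sizes,
                     expected=EXPECTED_SIZE_PT, tolerance=0.5):
    """Return one log line for every page whose size differs from expected."""
    messages = []
    for number, (width, height) in enumerate(page_sizes, start=1):
        wrong_width = abs(width - expected[0]) > tolerance
        wrong_height = abs(height - expected[1]) > tolerance
        if wrong_width or wrong_height:
            messages.append(
                f"{paper_code}: page {number} is {width} x {height} pt, "
                f"expected {expected[0]} x {expected[1]} pt"
            )
    return messages
```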

 

getPageNumberFromDB.wsf compares the number of pages recorded in the database with the actual number of pages in the PDF file.

If the two numbers are different, it writes a comment in the log file.
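
The comparison can be sketched in Python as below, assuming both sets of page counts are already available as dictionaries keyed by paper code. The function name and the message wording are invented, and the handling of papers missing on one side is an addition that the notes do not describe.

```python
def compare_page_counts(db_counts, pdf_counts):
    """Return one log line for every paper whose two page counts disagree."""
    messages = []
    for code in sorted(set(db_counts) | set(pdf_counts)):
        in_db = db_counts.get(code)
        in_pdf = pdf_counts.get(code)
        if in_db is None:
            messages.append(f"{code}: PDF has {in_pdf} pages but there is no database entry")
        elif in_pdf is None:
            messages.append(f"{code}: database says {in_db} pages but no PDF file was found")
        elif in_db != in_pdf:
            messages.append(f"{code}: database says {in_db} pages, PDF has {in_pdf}")
    return messages
```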

 

 

In createhtmlfilesforkeywords.zip, the script createHTMLFilesForKeywords.wsf generates, from the information in the database, all the HTML files for the keywords index (except the parent frame), see:

http://accelconf.web.cern.ch/AccelConf/e02/KEYWORDS/KEYWORDS.htm
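
The general idea of generating one HTML page per keyword can be sketched in Python as below. This is only an illustration and does not reproduce the output of createHTMLFilesForKeywords.wsf. The function name, the file names (keyword001.htm, ...), the link targets and the page layout are all hypothetical, and the keyword data is assumed to have been fetched from the database already.

```python
from html import escape


def build_keyword_pages(keyword_to_papers: dict[str, list[str]]) -> dict[str, str]:
    """Return a mapping of file name -> HTML text, one page per keyword."""
    pages = {}
    for index, keyword in enumerate(sorted(keyword_to_papers), start=1):
        # One list item per paper, linking to a PDF named after the paper code.
        links = "\n".join(
            f'  <li><a href="{escape(code)}.pdf">{escape(code)}</a></li>'
            for code in sorted(keyword_to_papers[keyword])
        )
        pages[f"keyword{index:03d}.htm"] = (
            f"<html><head><title>{escape(keyword)}</title></head>\n"
            f"<body>\n<h1>{escape(keyword)}</h1>\n<ul>\n{links}\n</ul>\n</body></html>\n"
        )
    return pages
```

Returning the pages as strings keeps the sketch free of file access; writing each entry to disk would be a separate step.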

 

In createhtmlfilesforTOC.zip, the script createHTMLFilesForTOC.wsf generates, from the information in the database, all the HTML files for the table of contents (except the parent frame), see:

http://accelconf.web.cern.ch/AccelConf/e02/TOC/TOC.htm

 

 

In createhtmlfilesforauthorsindex.zip, the script createHTMLFilesForAuthorsIndex.wsf generates, from the information in the database, all the HTML files for the authors index (except the parent frame), see:

http://accelconf.web.cern.ch/AccelConf/e02/AUTINDEX/AUTINDEX.htm