Page 14 - Greenstone tutorial exercises

7.  Difficult PDF documents
                        25. Build a fresh Greenstone collection from the two files in sample_files\difficult_documents.
                            Use the default collection configuration: that is, simply gather the files into a new
                            collection, and build it.
                        These files are called No extractable text.pdf and Weird characters.pdf—their names hint at the
                        problems they will cause!

                        26. Now preview the collection. The titles and filenames lists show only one of the two
                            documents. When you click the “text” icon to view the text extracted from that
                            document, it is garbage. During the build, this message appeared: “One document was
                            processed and included in the collection; one was rejected.”
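
                        Files of the first kind can often be spotted before they are gathered. The sketch below
                        is not part of Greenstone; it is a rough, standard-library-only heuristic that assumes a
                        PDF with no font references anywhere (typically a scanned, image-only file) has no text
                        to extract. The names pdf_chunks and looks_text_free are hypothetical. It will not catch
                        the second kind of problem, where fonts are present but the extracted characters come
                        out wrong.

```python
import re
import zlib

# Matches the body of each PDF stream object (non-greedy, up to "endstream").
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def pdf_chunks(data: bytes):
    """Yield the raw file bytes, then every stream body that inflates cleanly."""
    yield data
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the line break left before "endstream".
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (e.g. a JPEG image); skip it


def looks_text_free(data: bytes) -> bool:
    """Heuristic (an assumption, not a PDF parser): True when no /Font
    reference appears in the raw bytes or in any inflated stream."""
    return not any(b"/Font" in chunk for chunk in pdf_chunks(data))
```

                        To try it, read a file in binary mode and pass the bytes to looks_text_free. A True
                        result suggests the file may have no text to extract. Encrypted files or files using
                        other compression filters can fool the check, so treat the result as a hint only.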
                   Modes in the Librarian Interface
                        The Librarian Interface can operate in different modes. So far, you have been using the default
                        mode, called “Librarian.”
                        27. Use the Preferences item on the File menu to switch to Expert mode, and then build the
                            collection again. The Create panel looks different in Expert mode because it offers
                            more options. Locate the Build Collection button near the bottom of the window and
                            click it. This time a message appears saying that the file could not be processed,
                            and why.

                        28. We recommend that you switch back to Librarian mode for subsequent exercises, to avoid
                            confusion.