Digitizing books
STEPS, HINTS, VIDEOS
What is digitization?
Digitization is the process of converting information into a digital format. In this format, information is organized into discrete units of data that can be separately addressed. This is the binary data that computers and many devices with computing capacity can process.
http://whatis.techtarget.com/definition/0,,sid9_gci896692,00.html
See also: http://en.wikipedia.org/wiki/Digitizing
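
As a rough illustration of "discrete units of data that can be separately addressed", here is a minimal Python sketch. The brightness readings and the `digitize` helper are assumptions made up for this example; they do not come from any scanner's software.

```python
def digitize(samples, levels=256):
    """Quantize brightness readings in [0.0, 1.0] into integers 0..levels-1.

    Hypothetical helper for illustration; levels must be at most 256
    so that every value fits into one byte.
    """
    return bytes(min(levels - 1, int(s * levels)) for s in samples)


# Made-up analog readings: black, mid-gray, white.
data = digitize([0.0, 0.5, 1.0])

# Each unit can be addressed separately by its index...
middle = data[1]
# ...and is stored as binary data (eight 0s and 1s per byte).
middle_bits = format(middle, "08b")
```

Here `data` holds three bytes, one per reading, and `middle_bits` is the binary form of the second one.
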
Steps of digitization
1. Choose the book you want to digitize.
2. Choose OCR software (see the slide on choosing OCR software).
3. Scan your book. Choose the device: scanner, compact device, digital camera, or IRIScan (see the slide on how to scan the book).
4. Optical character recognition (see the image slide "Optical Character Recognition").
5. Correction (see the image slides "Correction image 1" and "Correction image 2").
6. Save as a text-searchable PDF document.
See other versions:
http://www.inquisition.ca/en/info/artic/comment_numeriser.htm
http://dlg.galileo.usg.edu/guide.html#01
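
The correction step is usually done by eye, but software can help flag likely misreadings. The sketch below uses Python's standard `difflib` module to suggest replacements for words that are not in a word list. The function name `suggest_corrections` and the tiny word list are hypothetical and only serve the example; a real project would load a full dictionary and handle punctuation.

```python
import difflib

# Hypothetical word list; far too small for real use.
KNOWN_WORDS = ["digitize", "the", "book", "scanner", "character", "recognition"]


def suggest_corrections(ocr_text, known_words=KNOWN_WORDS, cutoff=0.6):
    """Return {suspect_word: suggestion} for words missing from the word list."""
    suggestions = {}
    for word in ocr_text.lower().split():
        if word in known_words:
            continue  # word is known, nothing to correct
        # Ask difflib for the single most similar known word, if any.
        close = difflib.get_close_matches(word, known_words, n=1, cutoff=cutoff)
        if close:
            suggestions[word] = close[0]
    return suggestions
```

For example, `suggest_corrections("digitlze the book")` proposes "digitize" for the misread "digitlze". The suggestions still need a human check before they are applied.
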
Text and images
Text and images can be digitized similarly: a scanner captures an image (which may be an image of text) and converts it to an image file, such as a bitmap. An optical character recognition (OCR) program analyzes a text image for light and dark areas in order to identify each alphabetic letter or numeric digit, and converts each character into an ASCII code.
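
To make the "light and dark areas" idea concrete, here is a toy Python sketch. It assumes a made-up 5x3 grayscale scan and three hand-drawn templates. It turns the scan into dark and light cells, picks the template that agrees on the most cells, and returns that character with its ASCII code. Real OCR programs use far more elaborate methods; the names `SCAN`, `TEMPLATES`, `binarize` and `recognize` are invented for this example.

```python
# Toy 5x3 grayscale "scan" (0 = black ink, 255 = white paper); values are made up.
SCAN = [
    [250, 20, 245],
    [30, 240, 25],
    [15, 10, 35],
    [40, 235, 20],
    [25, 250, 30],
]

# Hypothetical 5x3 templates: "1" = dark (ink), "0" = light (paper).
TEMPLATES = {
    "A": ["010", "101", "111", "101", "101"],
    "L": ["100", "100", "100", "100", "111"],
    "1": ["010", "110", "010", "010", "111"],
}


def binarize(gray, threshold=128):
    """Mark every pixel darker than the threshold as "1", the rest as "0"."""
    return ["".join("1" if px < threshold else "0" for px in row) for row in gray]


def recognize(gray):
    """Return the best-matching template character and its ASCII code."""
    bitmap = binarize(gray)

    def score(template):
        # Count the cells where the scan and the template agree.
        return sum(
            a == b
            for scan_row, tmpl_row in zip(bitmap, template)
            for a, b in zip(scan_row, tmpl_row)
        )

    best = max(TEMPLATES, key=lambda ch: score(TEMPLATES[ch]))
    return best, ord(best)
```

With the sample data, `recognize(SCAN)` matches the "A" template and returns its ASCII code, 65.
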
Choose OCR software
There are many software packages for digitizing your documents.
Wikipedia has a comparison list of optical character recognition software. Check it out!
http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
(I recommend ABBYY FineReader.)
If you don't want to buy (or download) software, here's a free online OCR service:
http://www.newocr.com/
What is OCR?
OCR (optical character recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.
http://searchciomidmarket.techtarget.com/definition/OCR
Read more: http://en.wikipedia.org/wiki/Optical_character_recognition
What is ASCII?
ASCII (American Standard Code for Information Interchange) is the most common format for text files in computers and on the Internet. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit binary number (a string of seven 0s or 1s). 128 possible characters are defined.
Source: http://searchciomidmarket.techtarget.com/definition/ASCII
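
The 7-bit representation can be checked with a few lines of Python. This is a minimal sketch; the helper name `to_ascii_bits` is made up for this example.

```python
def to_ascii_bits(text):
    """Return each character's ASCII code as a string of seven 0s and 1s."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
    return [format(code, "07b") for code in codes]


bits = to_ascii_bits("OCR")   # ['1001111', '1000011', '1010010']
ascii_size = 2 ** 7           # seven bits allow 128 different codes
```

Characters outside the 128 defined codes, such as accented letters, cannot be encoded this way and make `to_ascii_bits` raise an error.
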
How to scan the book
With a scanner:
http://www.wikihow.com/Scan-a-Book
http://www.proportionalreading.com/scan.html
With a compact device:
http://www.ehow.com/how_6950098_scan-bookpdf-format.html
With a digital camera:
http://www.wikihow.com/Scan-a-Book-With-aDigital-Camera
With IRIScan:
http://www.youtube.com/watch?v=9bgcDHLe3Xg
Optical Character Recognition (image)
Correction image 1 (image)
Correction image 2 (image)
Videos
How to digitize a book:
http://www.youtube.com/watch?v=-M95Ob4kIak
How to chop and scan a book:
http://www.youtube.com/watch?v=8tx2JmW_p4c
Scanning text using OCR software:
http://www.youtube.com/watch?v=_SwrGtSY4-c
How to OCR PDFs easily with Acrobat Batch OCR:
http://www.youtube.com/watch?v=V6Iz3U5X-SU
How to digitize a million books:
http://www.youtube.com/watch?v=OlKhKyTS23E
How to put a scanned doc into PDF format
http://www.ehow.com/how_8563246_put-scanned-document-pdf-format.html
Some OCR software can save the result directly in PDF format.
Enjoy reading on your digital device!

Made by Mario Laskovics (2012.04.03)
