Python: PDF
Onnowpurbo (talk | contribs)  | 
Latest revision as of 05:29, 25 October 2018
PyPDF2
# install PyPDF2
pip install PyPDF2
# import the required module
import PyPDF2
# open the PDF in binary read mode
file = open('example.pdf', 'rb')
# create a PDF reader object (PyPDF2 releases before 3.0 call this PdfFileReader)
fileReader = PyPDF2.PdfReader(file)
# print the number of pages in the PDF (older releases: fileReader.numPages)
print(len(fileReader.pages))
# close the file when done
file.close()
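Because the file is opened in binary mode, it can be sanity-checked before it is handed to a reader. The sketch below assumes only that a PDF file begins with the bytes `%PDF-`. The helper name `looks_like_pdf` is hypothetical and is not part of PyPDF2.

```python
import os
import tempfile


def looks_like_pdf(path: str) -> bool:
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    # Binary mode matters: the header is compared as raw bytes.
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


if __name__ == "__main__":
    # Write a tiny stand-in file that has only a PDF header (not a valid PDF).
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(b"%PDF-1.4\n")
    print(looks_like_pdf(tmp.name))
    os.remove(tmp.name)
```

A check like this only catches obviously wrong inputs, such as an HTML error page saved with a .pdf extension. It does not prove that the file can be parsed.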
textract
pip install textract
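# read text from a PDF
import textract
# process() returns bytes, so decode before using the result as a string
text = textract.process('path/to/pdf/file', method='pdfminer')
print(text.decode('utf-8'))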
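The value returned by `textract.process` is a byte string, and extracted PDF text often carries stray line breaks and page-break characters. A minimal sketch of one way to tidy it up follows. The helper `clean_extracted_text` is a hypothetical name, not part of textract, and UTF-8 is an assumed encoding.

```python
import re


def clean_extracted_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode extractor output and collapse runs of whitespace into single spaces."""
    # Replace undecodable bytes rather than raising, since PDF text can be messy.
    text = raw.decode(encoding, errors="replace")
    # \s covers spaces, newlines, tabs and form feeds (\x0c), which may mark page breaks.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    # Stand-in for the bytes that textract.process(...) would return.
    sample = b"Hello,\n  PDF\x0cworld \n"
    print(clean_extracted_text(sample))
```

Collapsing all whitespace loses paragraph and page boundaries. If those matter, split on `\x0c` or on blank lines before normalizing.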