sjkscan package

Submodules

sjkscan.scan module

sjkscan.scan.run_scan(output_directory)

Run scanimage in batch mode.

Parameters:output_directory (string) – directory to write scanned images to
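
As an illustration, a batch scan might be launched roughly as follows. This is a minimal sketch, assuming SANE's scanimage is installed; the helper name build_scanimage_command and the page-%d.pnm file pattern are hypothetical, and the options sjkscan actually passes may differ.

```python
import os


def build_scanimage_command(output_directory):
    """Build an argument list for a batch scan (hypothetical helper)."""
    # The file name pattern is an assumption made for this sketch;
    # scanimage substitutes the page number for %d.
    pattern = os.path.join(output_directory, "page-%d.pnm")
    return ["scanimage", "--batch=" + pattern, "--format=pnm"]


command = build_scanimage_command("/tmp/scan")
```

The resulting list could then be handed to subprocess.run(command, check=True) to perform the scan.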
sjkscan.scan.scan()

Scan documents.

Documents are placed in data_dir/YYYY-MM-DD_HH-MM-SS.unfinished. Once the scan has completed, the ‘.unfinished’ suffix is removed from the directory name.
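
The naming scheme described above can be sketched as follows. The helper names unfinished_dir_name and mark_finished are hypothetical and only illustrate the convention; they are not part of sjkscan's documented API.

```python
import os
from datetime import datetime


def unfinished_dir_name(data_dir, now=None):
    """Return data_dir/YYYY-MM-DD_HH-MM-SS.unfinished for the given time."""
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(data_dir, stamp + ".unfinished")


def mark_finished(unfinished_path):
    """Drop the '.unfinished' suffix by renaming the directory."""
    suffix = ".unfinished"
    if not unfinished_path.endswith(suffix):
        raise ValueError("not an unfinished scan directory: " + unfinished_path)
    finished_path = unfinished_path[: -len(suffix)]
    os.rename(unfinished_path, finished_path)
    return finished_path
```

Renaming only after the scan completes means that a process watching data_dir can treat any directory without the suffix as ready for postprocessing.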

sjkscan.postprocessing module

sjkscan.postprocessing.is_blank(filename)

Check if image is blank.

Return True if filename is a blank image. This is a slightly modified version of Virantha Ekanayake’s is_blank(), which is part of Scanpdf (https://github.com/virantha/scanpdf) and licensed under the Apache license.

Parameters:filename (string) – file name of image to check
Returns:True if image is blank, False otherwise.
sjkscan.postprocessing.merge_pdfs(inputs, output)

Merge selected pdfs.

Parameters:
  • inputs (list) – files to concatenate
  • output (string) – name of file to write
sjkscan.postprocessing.merge_pdfs_in_dir(directory, output)

Read all pdf files in directory and create one merged output.

Parameters:
  • directory (string) – directory containing pdf files to be merged
  • output (string) – filename of new merged pdf
sjkscan.postprocessing.move_blanks(input_dir, output_dir)

Move blank .pnm files in input_dir to output_dir.

Parameters:
  • input_dir (string) – directory to check for blank .pnm files
  • output_dir (string) – where to move blank .pnm files
Returns:number of blank pages moved
Return type:int
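
A minimal sketch of the move-and-count behaviour described above. To keep the example self-contained, the blank check is passed in as a callable; that third parameter is an assumption of this sketch and is not part of the documented signature.

```python
import os
import shutil


def move_blanks(input_dir, output_dir, is_blank):
    """Move blank .pnm files from input_dir to output_dir; return the count.

    is_blank is a callable taking a file name (an assumption for this sketch).
    """
    os.makedirs(output_dir, exist_ok=True)
    moved = 0
    for name in sorted(os.listdir(input_dir)):
        if not name.endswith(".pnm"):
            continue  # only .pnm files are considered
        source = os.path.join(input_dir, name)
        if is_blank(source):
            shutil.move(source, os.path.join(output_dir, name))
            moved += 1
    return moved
```

Because only names ending in .pnm are examined, output_dir may safely be a subdirectory of input_dir, such as blank/.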

sjkscan.postprocessing.ocr(filename, language)

Perform OCR on file using Tesseract.

Parameters:
  • filename (string) – file to perform OCR on
  • language (string) – language(s) expected to be used in file
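
For illustration, a Tesseract invocation that produces a PDF could be assembled as shown here. The helper name build_tesseract_command is hypothetical, and the exact arguments sjkscan uses are not documented in this reference; multiple languages are joined with "+" as the Tesseract command line expects.

```python
import os


def build_tesseract_command(filename, language):
    """Build a Tesseract command that writes a PDF next to filename."""
    # Tesseract appends the output extension itself, so the output
    # base name is given without one.
    output_base = os.path.splitext(filename)[0]
    return ["tesseract", filename, output_base, "-l", language, "pdf"]


command = build_tesseract_command("scan/page-1.pnm", "eng+swe")
```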
sjkscan.postprocessing.ocr_pnms_in_dir(directory, language)

Perform OCR on all pnm files in given directory.

Parameters:
  • directory (string) – directory whose pnm files will be processed with OCR
  • language (string) – language(s) expected to be used in files
sjkscan.postprocessing.remove_if_blank(filename)

Remove file if it is blank.

This is useful when scanning in duplex mode using a backend that doesn’t support skipping blank pages.

Parameters:filename (string) – name of file to remove, if blank
sjkscan.postprocessing.rotate_all_images_in_dir(dirname, degrees)

Rotate all files in directory.

Parameters:
  • dirname (string) – name of directory in which files should be rotated
  • degrees (int) – number of degrees to rotate
sjkscan.postprocessing.rotate_image(filename, degrees)

Rotate image by the given number of degrees.

Parameters:
  • filename (string) – file to rotate
  • degrees (int) – number of degrees to rotate
sjkscan.postprocessing.scand()

Poll DATA_DIR for finished scans. For each finished scan found, scand will:

  • Move blank images to subdir blank/
  • Rotate remaining images
  • OCR remaining images
  • Merge resulting pdf files
  • Move the directory to INBOX
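
The polling and pipeline steps listed above can be sketched as follows. The helper names find_finished_scans and process_scan are hypothetical, and the sketch assumes that a finished scan is any directory in the data directory whose name no longer ends in ‘.unfinished’.

```python
import os
import shutil


def find_finished_scans(data_dir):
    """Return scan directories in data_dir that do not end in '.unfinished'."""
    finished = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if os.path.isdir(path) and not name.endswith(".unfinished"):
            finished.append(path)
    return finished


def process_scan(scan_dir, inbox, steps):
    """Run each step on scan_dir in order, then move the directory to inbox."""
    for step in steps:
        step(scan_dir)
    destination = os.path.join(inbox, os.path.basename(scan_dir))
    shutil.move(scan_dir, destination)
    return destination
```

A daemon built on this would call find_finished_scans() in a loop, sleeping between polls, and pass one callable per step (move blanks, rotate, OCR, merge) in steps.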
sjkscan.postprocessing.unpaper(filename)

Process file with unpaper and delete original.

Parameters:filename (string) – file to process with unpaper
sjkscan.postprocessing.unpaper_dir(directory, extension=None)

Run unpaper on all files with the given extension in directory.

Parameters:
  • directory (string) – directory to process
  • extension (string) – extension of files to run unpaper on

sjkscan.utils module

sjkscan.utils.read_config(config_file=None)

Read and populate utils.config.

Config values can be accessed from within other modules:

    from utils import config
    print(config['Paths'].get('data'))

provided that read_config() has been called beforehand.

Parameters:config_file (string) – optional filename to read, otherwise looks for sjkscan.conf in bundle.
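
A minimal sketch of a shared, module-level config object, assuming the configuration file is in the INI format understood by Python's configparser. That format is an assumption here, and the fallback to a bundled sjkscan.conf is left out.

```python
import configparser

# Module-level object that other modules can import after it is populated.
config = configparser.ConfigParser()


def read_config(config_file):
    """Populate the module-level config object from an INI-style file."""
    loaded = config.read(config_file)
    if not loaded:
        # ConfigParser.read() silently skips missing files, so check explicitly.
        raise FileNotFoundError(config_file)
    return config
```

With a file containing a [Paths] section and a data key, config['Paths'].get('data') returns the configured path once read_config() has run.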
sjkscan.utils.run_cmd(args)

Run shell command and return its output.

Parameters:args (list or string) – shell command and its arguments
Returns:output of command
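
One way such a helper could accept either a list or a string is sketched here. It is an assumption of this sketch that a string is split into arguments with shlex and run without invoking a shell; the real run_cmd() may behave differently.

```python
import shlex
import subprocess
import sys


def run_cmd(args):
    """Run a command given as a list or a string and return its output as text."""
    if isinstance(args, str):
        # Split the string into arguments instead of invoking a shell.
        args = shlex.split(args)
    return subprocess.check_output(args, universal_newlines=True)


output = run_cmd([sys.executable, "-c", "print('scanned')"])
```

subprocess.check_output() raises CalledProcessError when the command exits with a non-zero status, so callers can tell a failure apart from empty output.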