sjkscan package

Submodules

sjkscan.config

This module handles loading the configuration file and makes sure only proper settings are placed in the config dict.

copyright: (c) 2016 by Svante Kvarnström
license: BSD, see LICENSE for more details.

sjkscan.config.load_config(config_file=None)

Load the configuration file.

Configuration options will be available in dict sjkscan.conf.config. When configuration options are added, modified or removed in future releases, config_template in this function must be updated.

Parameters: config_file – file to read. Defaults to sjkscan.conf in package bundle.
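A minimal sketch of the template idea described above, assuming an INI-style file parsed with configparser. The CONFIG_TEMPLATE contents, the section and option names, and the load_config_sketch helper are hypothetical and not taken from sjkscan; the point is only that options missing from the template never reach the resulting dict.

```python
import configparser

# Hypothetical template: section -> option -> converter. The real
# config_template in sjkscan.config may look different.
CONFIG_TEMPLATE = {
    'Paths': {'data': str, 'inbox': str},
    'OCR': {'language': str},
    'Rotation': {'rotate': int},
}


def load_config_sketch(text):
    """Parse INI text and keep only options named in CONFIG_TEMPLATE."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    config = {}
    for section, options in CONFIG_TEMPLATE.items():
        if not parser.has_section(section):
            continue
        config[section] = {}
        for option, convert in options.items():
            if parser.has_option(section, option):
                config[section][option] = convert(parser.get(section, option))
    return config


example = """
[Paths]
data = /tmp/scans
unknown = ignored

[Rotation]
rotate = 180
"""
config = load_config_sketch(example)
```

Here the unknown option is silently dropped; a stricter variant could raise an error instead.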

sjkscan.logger

This module handles logging related tasks, such as initialising the logging environment.

copyright: (c) 2016 by Svante Kvarnström
license: BSD, see LICENSE for more details.

sjkscan.logger.init_logging(level)

Initialise logging.

Set up basic logging.

sjkscan.postprocessing

Implements all post-processing actions that sjkscan takes on a scanned document.

copyright: (c) 2016 by Svante Kvarnström
license: BSD, see LICENSE for more details.

sjkscan.postprocessing.is_blank(filename)

Check if image is blank.

Return True if filename is a blank image. This is a slightly modified version of Virantha Ekanayake’s is_blank(), which is part of Scanpdf (https://github.com/virantha/scanpdf) and licensed under the Apache license.

Parameters: filename (string) – file name of image to check
Returns: True if image is blank, False otherwise.

sjkscan.postprocessing.main(argv=None)

Poll DATA_DIR for finished scans. Once a finished scan is found, scand will:

  • Move blank images to subdir blank/
  • Rotate remaining images
  • OCR remaining images
  • Merge resulting pdf files
  • Move the directory to INBOX

sjkscan.postprocessing.merge_pdfs(inputs, output)

Merge selected pdfs.

Parameters:
  • inputs (list) – files to concatenate
  • output (string) – name of file to write

sjkscan.postprocessing.merge_pdfs_in_dir(directory, output)

Read all pdf files in directory and create one merged output.

Parameters:
  • directory (string) – directory containing pdf files to be merged
  • output (string) – filename of new merged pdf

sjkscan.postprocessing.move_blanks(input_dir, output_dir)

Move blank .pnm files in input_dir to output_dir.

Parameters:
  • input_dir (string) – directory to check for blank .pnm files
  • output_dir (string) – where to move blank .pnm files
Returns: number of blank pages moved
Return type: int
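The move-and-count behaviour can be sketched with the standard library alone. In this sketch the blank test is passed in as a predicate so the example runs without an image library; that extra is_blank parameter and the move_blanks_sketch name are assumptions made for illustration, not part of the documented signature.

```python
import os
import shutil
import tempfile


def move_blanks_sketch(input_dir, output_dir, is_blank):
    """Move blank .pnm files from input_dir to output_dir, return count."""
    os.makedirs(output_dir, exist_ok=True)
    moved = 0
    for name in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, name)
        if not (name.endswith('.pnm') and os.path.isfile(path)):
            continue
        if is_blank(path):
            shutil.move(path, os.path.join(output_dir, name))
            moved += 1
    return moved


# Demo with a stand-in predicate: treat empty files as "blank".
with tempfile.TemporaryDirectory() as scan_dir:
    for name, data in [('1.pnm', b''), ('2.pnm', b'P5 ...'), ('notes.txt', b'')]:
        with open(os.path.join(scan_dir, name), 'wb') as f:
            f.write(data)
    blank_dir = os.path.join(scan_dir, 'blank')
    count = move_blanks_sketch(
        scan_dir, blank_dir, lambda p: os.path.getsize(p) == 0)
    remaining = sorted(os.listdir(scan_dir))
    moved_names = sorted(os.listdir(blank_dir))
```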

sjkscan.postprocessing.ocr(filename, language)

Perform OCR on file using Tesseract.

Parameters:
  • filename (string) – file to perform OCR on
  • language (string) – language(s) expected to be used in file
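As a rough illustration of how filename and language could map onto a Tesseract invocation, the sketch below only builds the argument list and does not run anything. It assumes PDF output (suggested by the later merge step) and that the output is written next to the input; build_tesseract_cmd is a hypothetical helper, and the command sjkscan actually runs may differ.

```python
import os


def build_tesseract_cmd(filename, language):
    """Return an argument list that asks Tesseract for a PDF of `filename`."""
    output_base, _ = os.path.splitext(filename)  # Tesseract appends '.pdf'
    return ['tesseract', filename, output_base, '-l', language, 'pdf']


# Several languages can be combined with '+', e.g. Swedish and English.
cmd = build_tesseract_cmd('/tmp/scan/1.pnm', 'swe+eng')
```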

sjkscan.postprocessing.ocr_pnms_in_dir(directory, language)

Perform OCR on all pnm files in given directory.

Parameters:
  • directory (string) – directory in which all pnm files will be OCRed
  • language (string) – language(s) expected to be used in files

sjkscan.postprocessing.remove_if_blank(filename)

Remove file if it is blank.

This is useful when scanning in duplex mode using a backend that doesn’t support skipping blank pages.

Parameters: filename (string) – name of file to remove, if blank

sjkscan.postprocessing.rotate_all_images_in_dir(dirname, degrees)

Rotate all files in directory.

Parameters:
  • dirname (string) – name of directory in which files should be rotated
  • degrees (int) – number of degrees to rotate

sjkscan.postprocessing.rotate_image(filename, degrees)

Rotate image by the given number of degrees.

Parameters:
  • filename (string) – file to rotate
  • degrees (int) – number of degrees to rotate

sjkscan.postprocessing.unpaper(filename)

Process file with unpaper and delete original.

Parameters: filename – file to run unpaper on

sjkscan.postprocessing.unpaper_dir(directory, extension=None)

Run unpaper on all files with given extension in directory.

Parameters:
  • directory (string) – directory to process
  • extension (string) – extension of files to run unpaper on

sjkscan.scan

This module provides scanning functionality.

copyright: (c) 2016 by Svante Kvarnström
license: BSD, see LICENSE for more details.

sjkscan.scan.main()

Scan documents.

Documents are placed in data_dir/YYYY-MM-DD_HH-MM-SS.unfinished. Once the scan has been completed, the ‘.unfinished’ is removed.
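The naming scheme above can be sketched as two small helpers: one creates the timestamped directory with the '.unfinished' suffix, the other renames it once scanning is done. The helper names and the optional now parameter are hypothetical and exist only to make the example deterministic.

```python
import os
import tempfile
from datetime import datetime

SUFFIX = '.unfinished'


def start_scan_dir(data_dir, now=None):
    """Create data_dir/YYYY-MM-DD_HH-MM-SS.unfinished and return its path."""
    now = now or datetime.now()
    path = os.path.join(data_dir, now.strftime('%Y-%m-%d_%H-%M-%S') + SUFFIX)
    os.makedirs(path)
    return path


def finish_scan_dir(path):
    """Drop the '.unfinished' suffix to mark the scan as complete."""
    finished = path[:-len(SUFFIX)]
    os.rename(path, finished)
    return finished


with tempfile.TemporaryDirectory() as data_dir:
    unfinished = start_scan_dir(data_dir, datetime(2016, 5, 1, 12, 30, 0))
    # ... scanning would write page images into `unfinished` here ...
    finished = finish_scan_dir(unfinished)
    name = os.path.basename(finished)
    still_unfinished = os.path.exists(unfinished)
```

Because the rename happens only after scanning, a process polling the data directory can skip any directory whose name still ends in '.unfinished'.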

sjkscan.scan.run_scan(output_directory)

Run scanimage in batch mode.

Parameters:output_directory (string) – directory to write scanned images to
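For orientation, a batch run of scanimage is typically driven by its --batch and --format options. The sketch below only builds such an argument list; the file name pattern, the choice of pnm, and the build_scanimage_cmd helper are assumptions, and the options sjkscan really passes (device, resolution, source and so on) are not documented here.

```python
import os


def build_scanimage_cmd(output_directory):
    """Return an argument list for a scanimage batch run."""
    # %d is expanded by scanimage to the page number of each image.
    batch_pattern = os.path.join(output_directory, '%d.pnm')
    return ['scanimage', '--format=pnm', '--batch=' + batch_pattern]


cmd = build_scanimage_cmd('/tmp/scans/2016-05-01_12-30-00.unfinished')
```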

sjkscan.utils

This module provides utility functions.

copyright: (c) 2016 by Svante Kvarnström
license: BSD, see LICENSE for more details.

sjkscan.utils.files(dir, ext=None)

Yield regular files in directory, optionally of specific extension.

This function is a generator, and could be used like:

for f in utils.files('/some/directory', 'pnm'):
    do_something_to_the_pnm(f)

Parameters:
  • dir – directory to traverse
  • ext – extension of files to list. Leading dot is ignored.
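A generator with this behaviour could look roughly like the sketch below. It assumes a non-recursive listing and yields full paths in name order; whether sjkscan.utils.files recurses, and in which order it yields, is not stated above, and files_sketch is a hypothetical name.

```python
import os
import tempfile


def files_sketch(directory, ext=None):
    """Yield paths of regular files in directory, optionally by extension."""
    if ext is not None:
        ext = '.' + ext.lstrip('.')  # a leading dot in `ext` is ignored
    with os.scandir(directory) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        if not entry.is_file():
            continue  # skip subdirectories and other non-regular entries
        if ext is None or entry.name.endswith(ext):
            yield entry.path


with tempfile.TemporaryDirectory() as d:
    for name in ('1.pnm', '2.pnm', 'out.pdf'):
        open(os.path.join(d, name), 'w').close()
    os.mkdir(os.path.join(d, 'blank'))
    pnms = [os.path.basename(p) for p in files_sketch(d, '.pnm')]
    everything = [os.path.basename(p) for p in files_sketch(d)]
```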

sjkscan.utils.is_scan_name(name)

Determine whether name (probably) is the name of a scan directory.

Parameters: name – directory name to check
Returns: True if it is a scan directory, False if not.
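Since scan directories are named YYYY-MM-DD_HH-MM-SS, one way to make this check is a regular expression over that timestamp format. The pattern and the is_scan_name_sketch helper below are assumptions; in particular, this sketch rejects names that still carry the '.unfinished' suffix, which may or may not match the real function.

```python
import re

# Assumed pattern: the timestamp format used for scan directories.
SCAN_NAME = re.compile(r'\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}')


def is_scan_name_sketch(name):
    """Return True if name looks like YYYY-MM-DD_HH-MM-SS."""
    return SCAN_NAME.fullmatch(name) is not None


checks = {
    name: is_scan_name_sketch(name)
    for name in ('2016-05-01_12-30-00', 'blank', '2016-05-01',
                 '2016-05-01_12-30-00.unfinished')
}
```

The "probably" in the description fits this approach: the pattern checks the shape of the name, not whether the digits form a valid date.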

sjkscan.utils.move(old, new)

Move file.

Parameters:
  • old – file to move
  • new – new location/filename

sjkscan.utils.parse_args(argv=None)

Parse command line arguments.

Parameters: argv – array of command line arguments (sys.argv)
Returns: object with program arguments as attributes

sjkscan.utils.remove(file)

Remove file.

Parameters: file – file to remove

sjkscan.utils.run_cmd(args)

Run shell command and return its output.

Arguments and output are logged if log level DEBUG is set. Example usage:

output = run_cmd('ls -l')
for line in output.splitlines():
    print(line)

Parameters: args – list or string of shell command and arguments
Returns: output of command
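A wrapper with this interface could be sketched on top of subprocess as follows. This version splits string input with shlex instead of handing it to a shell, which is a deliberate simplification and may differ from what sjkscan does; run_cmd_sketch is a hypothetical name.

```python
import logging
import shlex
import subprocess
import sys


def run_cmd_sketch(args):
    """Run a command given as list or string; return its output as text."""
    if isinstance(args, str):
        args = shlex.split(args)  # split like a shell would, without invoking one
    logging.debug('Running: %s', args)
    output = subprocess.check_output(args, universal_newlines=True)
    logging.debug('Output: %s', output)
    return output


# Use the running interpreter so the demo does not depend on other tools.
output = run_cmd_sketch([sys.executable, '-c', 'print("hello")'])
```

subprocess.check_output raises CalledProcessError on a non-zero exit status, so callers of a wrapper like this need to decide whether to catch it.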

sjkscan.utils.version()

Return sjkscan version.

Returns: version string