Class FormatModule

    • Constructor Detail

      • FormatModule

        public FormatModule()
    • Method Detail

      • readContent

        public void readContent(ClassifiableContentIF cc,
                                TextHandlerIF handler)
        Description copied from interface: FormatModuleIF
        INTERNAL: Reads and analyzes the classifiable content and triggers callbacks on the text handler to identify the text and the structure of the classifiable content.
        Specified by:
        readContent in interface FormatModuleIF
      • matchesExtension

        public static boolean matchesExtension(String uri,
                                               String[] extensions)
      • getCharSetName

        public static String getCharSetName(int charSet)
      • getOffset

        public static int getOffset(int charSet)
      • detectCharSet

        public static int detectCharSet(byte[] content)
      • getBytes

        public static byte[] getBytes(String s)
      • getBytes

        public static byte[][] getBytes(String[] s)
      • startsWith

        public static boolean startsWith(byte[] content,
                                         byte[] s)
      • startsWithSkipWhitespace

        public static boolean startsWithSkipWhitespace(byte[] content,
                                                       byte[][] ss)
      • startsWithSkipWhitespace

        public static boolean startsWithSkipWhitespace(byte[] content,
                                                       byte[] s)
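
The static helpers above are listed without descriptions, so their exact behavior is not documented on this page. The sketch below shows one plausible reading of matchesExtension, startsWith, and both startsWithSkipWhitespace overloads, based only on their names and signatures. The class name PrefixSketch, the private helpers, the case-insensitive extension comparison, and the definition of whitespace as ASCII space, tab, CR, and LF are all assumptions made for illustration; this is not the FormatModule implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-in class; mirrors the signatures listed above only.
public class PrefixSketch {

  // Assumption: "whitespace" means ASCII space, tab, CR, and LF.
  private static boolean isWhitespace(byte b) {
    return b == ' ' || b == '\t' || b == '\r' || b == '\n';
  }

  // True if the bytes of s occur in content starting at index from.
  private static boolean regionMatches(byte[] content, int from, byte[] s) {
    if (s.length > content.length - from) return false;
    for (int i = 0; i < s.length; i++) {
      if (content[from + i] != s[i]) return false;
    }
    return true;
  }

  // Assumption: a URI matches if it ends with any extension, ignoring case.
  public static boolean matchesExtension(String uri, String[] extensions) {
    if (uri == null) return false;
    String lower = uri.toLowerCase(Locale.ROOT);
    for (String ext : extensions) {
      if (lower.endsWith(ext.toLowerCase(Locale.ROOT))) return true;
    }
    return false;
  }

  // True if content begins with exactly the bytes in s.
  public static boolean startsWith(byte[] content, byte[] s) {
    return regionMatches(content, 0, s);
  }

  // Skips leading whitespace in content, then checks for the prefix s.
  public static boolean startsWithSkipWhitespace(byte[] content, byte[] s) {
    int i = 0;
    while (i < content.length && isWhitespace(content[i])) i++;
    return regionMatches(content, i, s);
  }

  // True if any of the candidate prefixes matches after leading whitespace.
  public static boolean startsWithSkipWhitespace(byte[] content, byte[][] ss) {
    for (byte[] s : ss) {
      if (startsWithSkipWhitespace(content, s)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    byte[] doc = "  \n<?xml version=\"1.0\"?>".getBytes(StandardCharsets.US_ASCII);
    byte[] xmlDecl = "<?xml".getBytes(StandardCharsets.US_ASCII);
    byte[] htmlTag = "<html".getBytes(StandardCharsets.US_ASCII);

    System.out.println(startsWith(doc, xmlDecl));
    System.out.println(startsWithSkipWhitespace(doc, xmlDecl));
    System.out.println(startsWithSkipWhitespace(doc, new byte[][] { htmlTag, xmlDecl }));
    System.out.println(matchesExtension("report.HTML", new String[] { ".htm", ".html" }));
  }
}
```

Helpers of this shape are typically used for cheap format sniffing: a module can check a file extension or the first few bytes of the content before committing to a full parse. Whether FormatModule compares extensions case-insensitively, or which bytes it treats as whitespace, would need to be confirmed against the source.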