# KFileMetaData

## Introduction

KFileMetaData is a simple library for extracting text and metadata from a
number of different file types. It is typically used by file indexers to
retrieve metadata, and can also be used by applications to write metadata.

## Using the library

To use the library you must implement your own ExtractionResult class. An
instance of this class is passed to every applicable plugin, and each plugin
populates it with the information it extracts.

For convenience, a SimpleExtractionResult class is provided which stores all
the data in memory and allows it to be inspected later. Most clients *should*
implement their own ExtractionResult, as the data can get quite large when
extracting the text content from very large files.

The library also supports plugins that write data back to files.

## Extracting Metadata from a file

This requires creating an ExtractorCollection, fetching the extractor plugins
that are applicable to the file, and then passing the ExtractionResult
instance to each extractor.

A simple test example called `dump.cpp` is included.

## Writing Metadata to a file

Call the WriterCollection class's fetchWriters() method with the mimetype of
the file to be written to. The method returns a list of writers; to actually
write metadata, call a writer's write() method. The write() method accepts an
instance of the WriteData class, which stores a mapping of the properties to
be written and their values.

## Writing a custom file extractor

Metadata is extracted with the help of extraction plugins. Each plugin
provides a list of mimetypes that it supports, and implements the extraction
function, which extracts the data and stores it in an ExtractionResult.

Extractors for most common file types are already provided by the library.

Extractors should typically avoid implementing any logic themselves and should
just be wrappers on top of existing libraries.

## Writing a custom metadata writer

The writeback framework uses an approach similar to the extraction framework.
Each writer plugin supports a list of mimetypes and implements the write
function that takes in a WriteData object as input.

### Adding data into an `ExtractionResult`

The ExtractionResult can be filled with (key, value) pairs and plain text. The
keys in these pairs typically correspond to a predefined property. The list
of properties is defined in the `properties.h` header. Every plugin should
use the properties defined in this header. If a required property is missing,
it should be added to this framework.

The ExtractionResult should also be given a list of types. These types are
defined in the `types.h` header. They correspond to a higher-level view of
the file that matches what the user typically expects, such as Audio, Image or
Document.

## Writing an external plugin

Extractors and writers can also be written in other languages and installed
into the system, and KFileMetaData will be able to find and use them.

An external plugin must be an independently executable file (a binary, a
script with a hashbang line and the executable permission set, a batch file or
cmd script, etc.). It must be located within the libexec directory.

KFileMetaData wraps each external extractor in an instance of the
`ExternalExtractor` class, and each external writer in an instance of
`ExternalWriter`. The application is free to choose any of the plugins
returned by WriterCollection or ExtractorCollection.

Every external plugin is placed within a directory in
libexec/kf5/kfilemetadata/externalextractors. Every plugin must have a
manifest.json file that specifies the mimetypes the plugin supports and the
main executable file. A sample manifest file is located at
src/writers/externalwriters/example/manifest.json.

Both kinds of plugins accept the target file as an argument.
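
As a rough illustration of the manifest described above, the following Python
sketch generates the content of a manifest.json file. The key names `main` and
`mimetypes`, the shape of the `mimetypes` value, and the `build_manifest`
helper are assumptions made for this example and are not taken from this
README. The sample manifest in the source tree is the authoritative reference.

```python
import json
from typing import Iterable


def build_manifest(main_executable: str, mimetypes: Iterable[str]) -> str:
    """Return manifest.json content for a hypothetical external plugin.

    Assumed layout (verify against the sample manifest): "main" names the
    executable inside the plugin directory and "mimetypes" maps each
    supported mimetype to an object of per-mimetype settings.
    """
    manifest = {
        "main": main_executable,
        "mimetypes": {mimetype: {} for mimetype in mimetypes},
    }
    return json.dumps(manifest, indent=4)


if __name__ == "__main__":
    # Print a manifest for a plugin whose executable is main.py.
    print(build_manifest("main.py", ["text/plain"]))
```

Generating the file from a script is optional. A hand-written manifest works
just as well, provided it matches the format of the sample.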

### Writing an external extractor

Extractors take JSON-formatted input specifying the input mimetype, and return
JSON output with the extracted properties. The JSON output also indicates any
errors that might have occurred. Calls to the extractor are blocking, hence
there is a time limit on how long they can run.
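
The exchange described above might look roughly like the following Python
sketch of an extractor for plain text files. Several details are assumptions
rather than documented behaviour: that the request arrives on stdin, the
request fields `path` and `mimetype`, the response fields `status`, `error`
and `properties`, and the property keys `lineCount` and `wordCount`. Check the
ExternalExtractor source for the exact schema before relying on any of them.

```python
#!/usr/bin/env python3
import json
import sys
from typing import Optional


def handle_request(raw_request: str, target: Optional[str] = None) -> str:
    """Turn one JSON request into one JSON response (field names assumed)."""
    try:
        request = json.loads(raw_request)
        # Prefer the command-line target; fall back to an assumed "path" field.
        path = target or request["path"]
        mimetype = request.get("mimetype", "")
        if mimetype != "text/plain":
            raise ValueError(f"unsupported mimetype: {mimetype!r}")
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        properties = {
            "lineCount": len(text.splitlines()),  # assumed property key
            "wordCount": len(text.split()),       # assumed property key
        }
        response = {"status": "OK", "error": "", "properties": properties}
    except Exception as exc:
        # Report every failure as JSON so the caller can see what went wrong.
        response = {"status": "ERROR", "error": str(exc), "properties": {}}
    return json.dumps(response)


def main() -> None:
    """Read the request from stdin and write the response to stdout."""
    target = sys.argv[1] if len(sys.argv) > 1 else None
    sys.stdout.write(handle_request(sys.stdin.read(), target))


# When installed as a plugin, the script would end with a call to main().
```

Keeping the request handling in a plain function makes it possible to test
the plugin without spawning a process. Because calls are blocking and subject
to a time limit, an extractor like this should avoid slow or unbounded work.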

### Writing an external writer

Writers take in the properties to be changed via stdin and return JSON output
with the success value of the write operation. Calls to the writer are
blocking, hence there is a time limit on how long they can run.
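
A writer following the description above could be sketched in Python as shown
below. The request fields `path` and `properties` and the response fields
`status` and `error` are assumptions, not a documented schema. The
`store_properties` backend is a placeholder that saves the values in a sidecar
JSON file next to the target. A real writer would instead call a library that
understands the target file format.

```python
#!/usr/bin/env python3
import json
import sys
from pathlib import Path
from typing import Optional


def store_properties(target: Path, properties: dict) -> None:
    """Placeholder backend: merge properties into '<target>.meta.json'."""
    sidecar = target.with_name(target.name + ".meta.json")
    existing = {}
    if sidecar.exists():
        existing = json.loads(sidecar.read_text(encoding="utf-8"))
    existing.update(properties)
    sidecar.write_text(json.dumps(existing, indent=2), encoding="utf-8")


def handle_write(raw_request: str, target_arg: Optional[str] = None) -> str:
    """Apply one JSON write request and return a JSON status (names assumed)."""
    try:
        request = json.loads(raw_request)
        # Prefer the command-line target; fall back to an assumed "path" field.
        target = Path(target_arg or request["path"])
        if not target.is_file():
            raise FileNotFoundError(f"no such file: {target}")
        store_properties(target, request["properties"])
        response = {"status": "OK", "error": ""}
    except Exception as exc:
        # A failed write is reported in the JSON output, not as a crash.
        response = {"status": "ERROR", "error": str(exc)}
    return json.dumps(response)


def main() -> None:
    """Read the request from stdin and write the status to stdout."""
    target_arg = sys.argv[1] if len(sys.argv) > 1 else None
    sys.stdout.write(handle_write(sys.stdin.read(), target_arg))


# When installed as a plugin, the script would end with a call to main().
```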

## Links
- Mailing list: <https://mail.kde.org/mailman/listinfo/kde-devel>
- IRC channel: #kde-devel on Libera Chat