Case Settings

General Settings

The “General Settings” page allows you to tune settings such as the number of hits per page, comments, and the datechanger.

  • Use first folder as Facet or Custodian name?

    This feature is especially useful for eDiscovery investigations. Enabling this option allows you to place the data items that belong to a custodian in a separate folder. INDICA will recognise this folder as a custodian and add the capability to filter on custodian in the facet bar.

  • Enable OCR?

    This setting enables or disables Optical Character Recognition (OCR) processing. Disabling it results in a higher indexing speed, but the content of scanned documents will not be fully indexed. The index will be incomplete, as it will not contain the content of non-text-based data items.

    Tip

    It is possible to process OCR data at a later moment.

  • Pre-create document views?

    This option allows you to pre-create document views. This will result in nicer document previews and faster loading times.

  • Only index email meta data?

    When this option is enabled, only the metadata of emails is processed. The email content will not be included in the index.

  • Use nice document viewer?

    Enabling this option allows you to preview documents in a nicer document previewer.

  • Disable this option to remove ACL from shares

    When this option is disabled, the Access Control Lists (ACLs) of shares will not be taken into account when indexing. This means that all INDICA users can see all indexed documents, regardless of whether or not they can see the documents on the source.

  • Enable NLP library?

    When this option is enabled, INDICA will use its NLP algorithm to extract NLP keywords from documents.

  • How many hits should be shown per page?

    Number of hits per page. Can be changed to “Show None” to allow metadata investigations.

  • Enable Downloads?

    Whether or not downloading of documents from the user interface is allowed.

  • Enable Comments?

    Whether or not the “Comments” section in the previewer will be shown. This allows users to attach comments to documents.

  • Enable Stemming?

    When Stemming is enabled, the indexer will trim down verbs to their stem for indexing.

  • Enable datechanger?

    Datechanger enables the user to alter the date of a document. This is especially useful when a document is scanned in from paper. The file creation date will not necessarily reflect the original (paper) document date. The datechanger can change the document date in the index.
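
As an illustration of the “Use first folder as Facet or Custodian name?” option above, the following minimal sketch shows how a custodian name could be derived from a folder layout. It is not INDICA's implementation: the function name `custodian_for` is hypothetical, and the sketch assumes that paths are given relative to the root of the indexed data.

```python
from pathlib import PurePosixPath


def custodian_for(relative_path):
    """Return the first folder of a path relative to the data root, or None."""
    parts = PurePosixPath(relative_path).parts
    # A file that sits directly in the root has no folder, hence no custodian.
    return parts[0] if len(parts) > 1 else None


print(custodian_for("jdoe/mail/inbox.pst"))      # jdoe
print(custodian_for("asmith/docs/contract.doc")) # asmith
print(custodian_for("readme.txt"))               # None
```

With this layout, every item below the folder “jdoe” would be filterable under the custodian “jdoe”.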

Display Settings

Facet Settings

Facet Settings allows you to manage and create filtering capabilities in the INDICA front-end.

This overview shows the list of currently active filters (facets) in the front-end. From here, it’s possible to:

  • Add facets

    Type in a facet key in the top bar (“Enter facet key”), then click the “Add” button. A new facet will be created. The facet keys are based on the fields in the index. For help with selecting the correct field, contact INDICA support.

  • Re-order facets

    Re-ordering facets is possible by dragging the icon with the four horizontal lines. Changing the order of the facets will be reflected on the front-end.

  • Delete facets

    Click the “X” icon on the right to remove a facet. It can always be re-added if needed.

  • Edit facets

    It is possible to change some characteristics of a facet: its name, whether it is expanded by default, and whether it is hidden.

  • Reset facets to default

    When needed, the facets can be reset to the default setup by clicking the “Reset to default” button.

Changes in the facet settings will be presented to the end user as such:

Result list Settings

The following option can be changed on this page:

  • Allow user to toggle between list and table style

    When enabled, a button will appear on the search page that allows the user to switch between list-style results and table-style results. The table-style results can be used for a custom overview of query results, as the displayed information can be chosen by the user. This feature is very powerful, but less easy to use.

Example of list-style preview:

Example of table-style preview:

Detail View Settings

This panel allows you to change the following options:

  • Show Tagging?

    This option enables or disables tagging of documents in the front-end. Tags cannot be altered while this option is disabled.

List Style Settings

These options allow for customizing the information in the list-style results.

The following options can be changed:

  • Show date?

    Show or hide the date of each result item.

  • Show file size?

    Show or hide the file size of each result item.

  • Show ID?

    Show or hide the document ID of each result item.

  • Show path?

    Show or hide the file path of each result item.

  • Show summary?

    Show or hide the document summary of each result item.

  • Show similar?

    Show or hide the “Similar Documents” button of each result item.

  • Show duplicates?

    Show or hide the “Duplicates” button of each result item.

Search Settings

The search settings allow you to manipulate the search results. This can be done by defining synonyms, stopwords, and editing the boosting settings.

Stopword List

During indexing, it is possible to exclude a list of stopwords from the index. Those words are generally words without informational value, like “a”, “and”, “this”, “the”, etc.

Words can be added by typing them in the text field and then pressing the “Add” button.

INDICA comes with a default list of stopwords, which can be changed here as well.
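
The effect of a stopword list can be sketched as follows. This is only an illustration of the idea, not INDICA's indexer: the function name `index_terms` and the four-word list are assumptions made for the example, and INDICA's default list is not reproduced here.

```python
import re

# Hypothetical stopword list for illustration only.
STOPWORDS = {"a", "and", "this", "the"}


def index_terms(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into words, and drop the stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stopwords]


print(index_terms("This is the report and a summary"))
# ['is', 'report', 'summary']
```

Adding “is” to the list would remove it from the indexed terms as well.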

Synonym List

It is also possible to define synonyms. Synonyms automatically broaden the search results: a search for one word also returns documents that contain one of its synonyms. Synonyms can be added by typing them in the text field and then clicking the “Add” button.

Synonyms need to be added as a comma-separated list, for example: “hello,hi,hey”.
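
As a rough sketch of what a synonym entry means for a search, the snippet below parses a comma-separated entry and expands a search term with its synonyms. The function names are hypothetical, and how INDICA applies synonyms internally is not described here.

```python
def parse_synonyms(line):
    """Split a comma-separated synonym entry into a set of lowercase words."""
    return {w.strip().lower() for w in line.split(",") if w.strip()}


def expand_query(term, synonym_groups):
    """Return the term plus every synonym from the groups that contain it."""
    expanded = {term.lower()}
    for group in synonym_groups:
        if term.lower() in group:
            expanded |= group
    return sorted(expanded)


groups = [parse_synonyms("hello,hi,hey")]
print(expand_query("hi", groups))      # ['hello', 'hey', 'hi']
print(expand_query("report", groups))  # ['report']
```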

Boosting

Boosting can be done by adding a boost query, or by adding a boost function.

Boost Query

The Boost Query specifies an additional query clause that will be added to the user’s main query to influence the score.

INDICA provides the relevance level of matching documents based on the results found. To boost a term, append the caret symbol “^” and a boost factor (a number) to the term you are searching for. The higher the boost factor, the more relevant the term will be. Boosting allows you to control the relevance of a document by boosting the terms it matches. For example, if you are searching for

jakarta apache

And you want the term “jakarta” to be more relevant, boost it using the ^ symbol along with the boost factor next to the query. You would type:

jakarta^4 apache

This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example:

"jakarta apache"^4 "Indica search"

By default, the boost factor for each term or phrase is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).
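
The boost syntax described above can be generated programmatically. The helper below is a hypothetical convenience function, not part of INDICA; it only builds query strings in the `term^factor` form shown in the examples.

```python
def boost(term, factor=1.0):
    """Append ^factor to a term; multi-word phrases are wrapped in double quotes."""
    if factor <= 0:
        raise ValueError("boost factor must be positive")
    text = f'"{term}"' if " " in term else term
    # A factor of 1 is the default, so it can be left out of the query.
    if factor == 1:
        return text
    return f"{text}^{factor:g}"


print(" ".join([boost("jakarta", 4), boost("apache")]))  # jakarta^4 apache
print(boost("jakarta apache", 4))                        # "jakarta apache"^4
print(boost("apache", 0.2))                              # apache^0.2
```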

Boost Function

A Boost Function can also be added here. This feature is for advanced users. Please refer to the official documentation for help.

API Settings

INDICA supports two types of APIs: the polling API and the high-level API.

Polling API

The INDICA Polling API service for external systems is a secure API that enables external systems to query the INDICA index in a limited way. It is designed for asynchronous communication initiated by the INDICA appliance (polling for instructions). The API can only transfer meta information; the actual content of a document object is never transmitted to the external system that asks for information.

The external system creates a queue of queries (the command set) on its own API endpoint, to be run against the INDICA index. The INDICA polling API then returns the results to the external system's designated API endpoint. Query results can contain metadata from the index, Privacy issues, and, for authorized users, links to a preview in the INDICA system itself.

This API is designed for on-premise INDICA systems and Internet-facing applications.

Setting up your Application

For the API to work, your application requires a very basic job system: a single endpoint (GET), protected with Basic Auth or preferably OAuth, that returns a queue of commands/queries for INDICA to process, and a single protected endpoint (POST) that receives the results from INDICA.

Endpoint paths are configurable, and you can create them according to your system limits/needs. However, there should be exactly two endpoints (not more), and the POST endpoint in particular has to match the specified pattern.

Required Endpoints

Type  Path                       Ext. App. JSON structure  Description
----  -------------------------  ------------------------  -----------
GET   /api/indica/jobs           Job GET Endpoint JSON     The response body has to match one of the job type structures.
POST  /api/indica/jobs/{job_id}  Job Result JSON           The result that you will receive from INDICA once a specific job is done.

INDICA also accepts more job types than the ones shown in the examples above. The API endpoints stay the same on your system, but the required request and response JSON structures differ per job type.
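
A minimal sketch of such a job system on the external application's side is shown below, using only the Python standard library. It is an illustration under several assumptions: the in-memory `pending_jobs` list and `results` dictionary stand in for a real job store, the host, port, and the body returned by the POST handler are arbitrary, and authentication (Basic Auth or OAuth) is left out for brevity even though a real deployment needs it.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

JOBS_PATH = "/api/indica/jobs"
RESULT_PATH = re.compile(r"^/api/indica/jobs/(?P<job_id>[^/]+)$")

# In-memory stand-ins for a real job store (hypothetical).
pending_jobs = [{
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "q": "\"Dead Kennedys\"~0",
    "fq": "",
    "return_fields": ["file_name", "mime_type"],
    "status": "pending",
}]
results = {}


def match_result_path(path):
    """Return the job_id if the path matches /api/indica/jobs/{job_id}, else None."""
    match = RESULT_PATH.match(path)
    return match.group("job_id") if match else None


class IndicaJobHandler(BaseHTTPRequestHandler):
    def _send_json(self, code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Job queue endpoint: return the list of jobs for INDICA to process.
        if self.path == JOBS_PATH:
            self._send_json(200, pending_jobs)
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # Result endpoint: store the result that INDICA posts for one job.
        job_id = match_result_path(self.path)
        if job_id is None:
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        results[job_id] = json.loads(self.rfile.read(length) or b"{}")
        self._send_json(200, {"received": job_id})


def run_server(host="127.0.0.1", port=8080):
    """Serve both endpoints until interrupted."""
    HTTPServer((host, port), IndicaJobHandler).serve_forever()
```

Calling `run_server()` starts the application; the two handlers correspond to the two rows of the “Required Endpoints” table.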

Job Types

Type             Job Queue JSON                Response JSON                  Description
---------------  ----------------------------  -----------------------------  -----------
document_query   Document Query JSON           Document Query Response JSON   Returns hit counts and the meta fields you requested.
subject_details  Subject Details Request JSON  Document Query Response JSON   Returns the same values as Document Query. The only difference is that the query is generated by INDICA.
document_export  Export Documents JSON         Document Export Response JSON  Prepares a package based on the specified tags and uploads it to the specified vendor with the specified security measures.
privacy_query    Privacy Query JSON            Privacy Query Response JSON    Returns Privacy issues on all or specific assets.

Batched result responses

All responses can be batched if the job request includes the fields “batch”: 1 and “batch_size” in the JSON. The value of batch_size has to be an integer that gives the maximum character count of each resulting JSON response. Batched results have one extra JSON key (“batch_id”) and an extra status, “inprogress”. The “batch_id” represents the position of the response within the batch, and the status tells whether the batch is still “inprogress” or is “processed”.

Batch Response JSON
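
On the receiving side, batched responses have to be put back together. The sketch below shows one way to do that. The function name `merge_batches` is hypothetical, and the sketch assumes, based on the batch examples, that “batch_id” starts at 0 and that the last batch carries the status “processed”.

```python
def merge_batches(responses):
    """Combine the batched result responses of one job into a single entry list."""
    parts = []
    for response in responses:
        # The batch examples wrap each result object in a list; accept both forms.
        parts.extend(response if isinstance(response, list) else [response])
    parts.sort(key=lambda part: part["batch_id"])

    entries = []
    for part in parts:
        entries.extend(part["result_collection"]["entries"])

    ids = [part["batch_id"] for part in parts]
    complete = (
        bool(parts)
        and ids == list(range(len(parts)))      # no batch is missing
        and parts[-1]["status"] == "processed"  # the final batch has arrived
    )
    return {"entries": entries, "complete": complete}


first = [{"job_id": "j1", "batch_id": 0, "status": "inprogress",
          "result_collection": {"count": 1, "entries": [{"file_name": "John_Doe.doc"}]}}]
second = [{"job_id": "j1", "batch_id": 1, "status": "processed",
           "result_collection": {"count": 1, "entries": [{"file_name": "test.txt"}]}}]

print(merge_batches([second, first]))
# {'entries': [{'file_name': 'John_Doe.doc'}, {'file_name': 'test.txt'}], 'complete': True}
```

Sorting on “batch_id” makes the result independent of the order in which the responses arrive.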

Setting up a polling job on the INDICA appliance

For the job polling system to work correctly, the configuration guidelines below must be followed strictly.

Settings configuration currently resides in Settings > API Settings inside the case management.

To create a new polling job, press the “Add new API polling job” button.

Polling jobs will run according to your settings:

  • Result data type

    Can be changed based on what data you want to be returned to the external source. Currently, there is “Document query” - which returns document fields that you select further down in “Return fields”, and there is also a choice to return “Privacy issues”. Structure of returned data is described below;

  • Run query in user scope

    Specifies whether the query should include only the data that a specific user has the rights to see (this could be an API user that you have set up in your AD);

  • User

    The user whose scope will be used;

  • Time between polling

    Is used to specify how often the system will check if there are new jobs in the GET endpoint, and run them. Running jobs too often might degrade the performance of your system;

  • Endpoint Authentication type

    Specifies what type of authentication external endpoints use;

  • Job Queue GET endpoint

    Specifies the endpoint from which the list of jobs will be retrieved in JSON format;

  • Result POST endpoint

    Specifies the endpoint to which the queried Document field(s) or Privacy issues will be sent;

  • Require acknowledge

    Specifies whether receipt of jobs should be acknowledged. It is strongly recommended to use acknowledgment, since it prevents duplicate jobs when the time between polling is short or when queries take long to run. It guarantees that the same job is not picked up more than once;

  • Job acknowledgment endpoint

    Specifies the endpoint to which the acknowledgment will be sent.
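
To show why acknowledgment prevents duplicate jobs, the following sketch models a job store in the external application that stops offering a job once it has been acknowledged. The class `JobQueue` and its methods are hypothetical; the request that INDICA sends to the acknowledgment endpoint is not described here, so the sketch only models the bookkeeping behind that endpoint.

```python
class JobQueue:
    """In-memory job store that hides jobs once they have been acknowledged."""

    def __init__(self):
        self._jobs = {}             # job_id -> job dictionary
        self._acknowledged = set()  # job_ids that were confirmed as received

    def add(self, job):
        self._jobs[job["job_id"]] = job

    def pending(self):
        """Jobs to return from the GET endpoint: everything not yet acknowledged."""
        return [job for job_id, job in self._jobs.items()
                if job_id not in self._acknowledged]

    def acknowledge(self, job_id):
        """Mark a job as received; returns False for an unknown job_id."""
        if job_id not in self._jobs:
            return False
        self._acknowledged.add(job_id)
        return True


queue = JobQueue()
queue.add({"job_id": "a1", "job_type": "privacy_query", "q": "", "fq": ""})
print(len(queue.pending()))     # 1
print(queue.acknowledge("a1"))  # True
print(len(queue.pending()))     # 0
```

Without the acknowledgment step, a second poll that arrives before the first query has finished would receive the job “a1” again.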

High level API

Polling API Examples

Job GET Endpoint JSON

[
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "q": "\"David Vikander \"~0",
    "fq": "",
    "return_fields": [
      "file_name",
      "mime_type",
      "size"
    ],
    "status": "processed"
  },
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "q": "\"Dead Kennedys\"~0 OR \"test\"~0",
    "fq": "",
    "return_fields": [
      "file_name",
      "mime_type"
    ],
    "status": "processed"
  }
]

Job Result JSON

{
  "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
  "job_type": "document_query",
  "result_url": "http://192.168.2.128/search?q=%22David%20Vikander%22~0&page=1&fq={%22privacy_Combined%22:[%22NAME%22]}&sort=score%20desc&mlt=0",
  "result_collection": {
    "count": 2,
    "entries": [
      {
        "file_name": "John_Doe.doc",
        "mime_type": "doc"
      },
      {
        "file_name": "John Doe.pst",
        "mime_type": "pst"
      }
    ]
  },
  "status": "processed",
  "error_message": "error"
}

Document Query JSON

[
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "q": "\"David Vikander \"~0",
    "fq": [
      {
        "tags": [
          "2_todelete"
        ]
      }
    ],
    "return_fields": [
      "file_name",
      "mime_type",
      "size"
    ],
    "exporting": true,
    "status": "processed"
  },
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "q": "\"Dead Kennedys\"~0 OR \"test\"~0",
    "fq": "",
    "return_fields": [
      "file_name",
      "mime_type"
    ],
    "exporting": true,
    "status": "pending"
  }
]
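
Job entries like the ones above can be generated by the external application. The helper below is a hypothetical sketch: generating the “job_id” with `uuid4` and setting the initial status to “pending” are assumptions, and the “exporting” field is left out because its meaning is not described here. The optional batch fields follow the section “Batched result responses”.

```python
import json
import uuid


def document_query_job(q, return_fields, fq="", batch_size=None):
    """Build one document_query job entry for the job queue."""
    job = {
        "job_id": uuid.uuid4().hex,  # 32 hex characters, like the IDs in the examples
        "job_type": "document_query",
        "q": q,
        "fq": fq,
        "return_fields": list(return_fields),
        "status": "pending",
    }
    if batch_size is not None:
        # Batching is requested with "batch": 1 plus a maximum character count.
        job["batch"] = 1
        job["batch_size"] = int(batch_size)
    return job


queue = [document_query_job('"Dead Kennedys"~0', ["file_name", "mime_type"], batch_size=2000)]
print(json.dumps(queue, indent=2))
```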

Document Query Response JSON

{
  "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
  "job_type": "document_query",
  "result_url": "http://192.168.2.128/search/SCA0001052?q=a",
  "result_collection_urls": [
    {
      "name": "TestCollections",
      "url": "http://192.168.2.128/search/SCA0001052?module=TestCollections&q=a"
    },
    {
      "name": "Sites",
      "url": "http://192.168.2.128/search/SCA0001052?module=Sites&q=a"
    }
  ],
  "result_collection": {
    "count": 2,
    "entries": [
      {
        "file_name": "John_Doe.doc",
        "mime_type": "doc"
      },
      {
        "file_name": "John Doe.pst",
        "mime_type": "pst"
      }
    ]
  },
  "tags": [
    {
      "title": "To Delete",
      "value": "2_todelete"
    },
    {
      "title": "Unresolved",
      "value": "1_unresolved"
    }
  ],
  "status": "processed",
  "error_message": "error"
}

Subject Details Request JSON

{
  "job_id": "SCA0001017",
  "job_type": "subject_details",
  "subject_details": {
    "name": "John Doe",
    "tel": "88123456789",
    "email": "jonh@example.com",
    "address": "Elm Street 13"
  },
  "status": "pending"
}

Export Documents JSON

[
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_export",
    "packaging": 1,
    "export_vendor": "box",
    "query_job_id": "56rsbs0e5css51972",
    "protection": "pass",
    "subject_email": "test@example.com",
    "status": "processed"
  },
  {
    "job_id": "8e6d5fe7db822300e5c4e1bb4b961947",
    "job_type": "document_export",
    "packaging": 1,
    "export_vendor": "box",
    "query_job_id": "56rsbs0e5css51972",
    "protection": "pass",
    "subject_email": "test@example.com",
    "status": "processed"
  }
]

Document Export Response JSON

{
  "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
  "job_type": "document_export",
  "package_url": "https://example.com",
  "status": "processed",
  "error_message": "error"
}

Privacy Query JSON

[
  {
    "job_id": "e6e31e4ddb032300e5c4e1bb4b9619fa",
    "job_type": "privacy_query",
    "q": "",
    "fq": "",
    "status": "processed"
  }
]

Privacy Query Response JSON

{
  "job_id": 64,
  "job_type": "privacy_query",
  "result_collection": {
    "count": 1,
    "entries": [
      {
        "asset": {
          "type": "collection",
          "identifiers": {
            "name": "Unstructured Data",
            "ip": "198.168.1.0",
            "path": "/dropbox/JohnDoe/things"
          }
        },
        "issues": [
          {
            "type": "NAME",
            "count": 227
          },
          {
            "type": "EMAIL",
            "count": 53
          },
          {
            "type": "TEL",
            "count": 11
          },
          {
            "type": "CC",
            "count": 4
          }
        ]
      }
    ]
  },
  "status": "processed",
  "error_message": "error"
}
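
A privacy query response such as the one above can be summarised per issue type on the receiving side. The function below is a hypothetical sketch that relies only on the “result_collection”, “entries”, “issues”, “type”, and “count” keys shown in the example.

```python
from collections import Counter


def total_issues_by_type(response):
    """Sum the issue counts per issue type over all assets in a response."""
    totals = Counter()
    for entry in response["result_collection"]["entries"]:
        for issue in entry["issues"]:
            totals[issue["type"]] += issue["count"]
    return dict(totals)


response = {
    "job_type": "privacy_query",
    "result_collection": {
        "count": 1,
        "entries": [{
            "asset": {"type": "collection"},
            "issues": [
                {"type": "NAME", "count": 227},
                {"type": "EMAIL", "count": 53},
                {"type": "TEL", "count": 11},
                {"type": "CC", "count": 4},
            ],
        }],
    },
}

print(total_issues_by_type(response))
# {'NAME': 227, 'EMAIL': 53, 'TEL': 11, 'CC': 4}
```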

Batch Response JSON

Response 1

[
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "batch_id": 0,
    "result_url": "indica.lan/query=JonhDoe",
    "result_collection": {
      "count": 2,
      "entries": [
        {
          "file_name": "John_Doe.doc",
          "mime_type": "doc"
        },
        {
          "file_name": "John Doe.pst",
          "mime_type": "pst"
        }
      ]
    },
    "status": "inprogress"
  }
]

Response 2

[
  {
    "job_id": "47ec1fe7db822300e5c4e1bb4b961972",
    "job_type": "document_query",
    "batch_id": 1,
    "result_url": "indica.lan/query=JonhDoe",
    "result_collection": {
      "count": 2,
      "entries": [
        {
          "file_name": "test.pst",
          "mime_type": "email"
        },
        {
          "file_name": "test.txt",
          "mime_type": "text"
        }
      ]
    },
    "status": "processed"
  }
]