This package extracts data from images such as social media posts that contain an image part and a text part. The analysis can generate a very large number of features, depending on the user input. See our paper for a more in-depth description.
This project is currently under development!
Use pre-processed image files, such as social media posts with comments, and process them to collect the following information:
- Text extraction from the images
  - Language detection
  - Translation into English or other languages
- Content extraction from the images
  - Textual summary of the image content ("image caption")
  - Feature extraction from the images: User inputs query and images are matched to that query (both text and image query)
  - Question answering about image content
- Content extraction from videos
  - Textual summary of the video content
  - Question answering about video content
- Color analysis
  - Analyse hue and percentage of color on image
- Multimodal analysis
  - Find best matches for image content or image similarity
- Cropping images to remove comments from posts
The AMMICO package can be installed using pip:
```bash
pip install ammico
```
Or install the development version from GitHub (currently recommended for the new features):
```bash
pip install git+https://github.com/ssciwr/AMMICO.git
```
This will install the package and its dependencies locally.
Demonstration notebooks can be found in the docs/tutorials folder and also on Google Colab.
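To give a first impression, here is a minimal sketch of a typical ammico session following the pattern used in the demonstration notebooks. The helper names (`find_files`, `TextDetector`, `get_dataframe`) are taken from the tutorials and may change while the project is under development; the paths are placeholders:

```python
import ammico

# Collect images from a folder into ammico's input dictionary
# (path and limit are placeholders; adjust them to your data).
image_dict = ammico.find_files(path="data/", limit=10)

# Run text extraction (and language detection/translation) on each image.
for key in image_dict.keys():
    image_dict[key] = ammico.TextDetector(
        image_dict[key], analyse_text=True
    ).analyse_image()

# Convert the nested results to a dataframe and export as csv.
image_df = ammico.get_dataframe(image_dict)
image_df.to_csv("results.csv")
```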
The text is extracted from the images using google-cloud-vision. For this, you need an API key. Set up your Google account following the instructions on the Google Vision AI website or as described here. You then need to export the location of the API key as an environment variable:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="location of your .json"
```
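If you work in a notebook, the variable can also be set from Python before any ammico detectors are run; the path below is a placeholder:

```python
import os

# Point Google Cloud clients at your service-account key file;
# replace the path with the location of your own .json key.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/your-key.json"
```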
The extracted text is then stored under the text key (column when exporting a csv).
Googletrans is used to recognize the language automatically and translate the text into English. The text language and the translated text are then stored under the text_language and text_english keys (columns when exporting a csv).
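Put together, the text-related part of one image's result dictionary then looks roughly like this (the keys are those described above; the values are invented for illustration):

```python
# Example of the text-related keys in one image's result dictionary;
# the values are made up for illustration.
entry = {
    "text": "Schönes Wetter heute!",        # text found in the image
    "text_language": "de",                  # detected language code
    "text_english": "Nice weather today!",  # English translation
}
```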
The image and video content ("caption") is extracted using the Qwen2.5-VL vision-language model family. Qwen2.5-VL is a multimodal large language model capable of understanding and generating content from both images and videos. With its help, ammico supports tasks such as image/video summarization and image/video visual question answering, where the model answers users' questions about the content of a media file.
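For reference, here is a minimal sketch of captioning a single image directly with Qwen2.5-VL through Hugging Face transformers, following the example in the Qwen2.5-VL model card. ammico wraps this behind its own detector classes; the checkpoint name, image path, and prompt below are placeholders:

```python
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info

# Load a Qwen2.5-VL checkpoint (placeholder; pick a size that fits your GPU).
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct", torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

# Chat-style request: one image plus a captioning instruction.
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "post.jpg"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ],
}]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)

generated = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens before decoding the generated caption.
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```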
The audio transcription, language detection and translation are carried out using the WhisperX model family, which builds on the Whisper model developed by OpenAI.
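For orientation, here is a minimal sketch of transcribing a media file directly with WhisperX, following its README. ammico handles this internally; the model size, device, and file name below are placeholders:

```python
import whisperx

device = "cuda"  # or "cpu"

# Load the transcription model (size and compute type are placeholders).
model = whisperx.load_model("large-v2", device, compute_type="float16")

# Decode the audio track and transcribe it in batches.
audio = whisperx.load_audio("post_video.mp4")
result = model.transcribe(audio, batch_size=16)

print(result["language"])  # detected language code
print(result["segments"])  # transcribed segments with timestamps
```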
Color detection is carried out using colorgram.py and colour for the distance metric. The colors can be classified into the main named colors/hues of the English language: red, green, blue, yellow, cyan, orange, purple, pink, brown, grey, white, and black.
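As an illustration, colorgram.py can also be used on its own to extract the dominant colors of an image. The file name and number of colors below are placeholders; ammico additionally maps the extracted colors to the named hues listed above:

```python
import colorgram

# Extract the six most prominent colors from an image (placeholders).
colors = colorgram.extract("post.jpg", 6)

for color in colors:
    # Each color carries an RGB value and its share of the image area.
    print(color.rgb, f"{color.proportion:.1%}")
```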
We welcome contributions to the ammico project! If you'd like to help improve the tool, add new features, or report or fix bugs, please follow these guidelines.
Please use the issues tab to report bugs, request features, or start discussions.
ammico is licensed under the MIT license.
Ammico has been published in Comp. Comm. Res.; please cite the paper as specified in the Citation file.