Commit 990668d

Merge branch 'main' into develop
zsh:1: command not found: wq
2 parents 7d51e44 + 5c4722f commit 990668d

4 files changed: 36 additions and 7 deletions

.github/workflows/publish.yml

Lines changed: 3 additions & 1 deletion
@@ -4,13 +4,15 @@ on:
   push:
     branches:
       - main
+    tags:
+      - 'v*'

 jobs:
   test:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.10", "3.11", "3.12"]
+        python-version: ["3.10"]

     steps:
       - uses: actions/checkout@v4

.github/workflows/python-package.yml

Lines changed: 2 additions & 2 deletions
@@ -5,9 +5,9 @@ name: Python package

 on:
   push:
-    branches: [ "main" ]
+    branches: [ "main", "develop" ]
   pull_request:
-    branches: [ "main" ]
+    branches: [ "main", "develop" ]

 jobs:
   build:

README.md

Lines changed: 11 additions & 2 deletions
@@ -1,6 +1,15 @@
-# `SqueakyCleanText`
+<div align="center">

-[![PyPI](https://img.shields.io/pypi/v/squeakycleantext.svg)](https://pypi.org/project/squeakycleantext/) [![PyPI - Downloads](https://img.shields.io/pypi/dm/squeakycleantext)](https://pypistats.org/packages/squeakycleantext)
+# SqueakyCleanText
+
+[![PyPI](https://img.shields.io/pypi/v/squeakycleantext.svg)](https://pypi.org/project/squeakycleantext/)
+[![PyPI - Downloads](https://img.shields.io/pypi/dm/squeakycleantext)](https://pypistats.org/packages/squeakycleantext)
+[![Python package](https://github.com/rhnfzl/SqueakyCleanText/actions/workflows/python-package.yml/badge.svg)](https://github.com/rhnfzl/SqueakyCleanText/actions/workflows/python-package.yml)
+[![Python Versions](https://img.shields.io/badge/Python-3.10%20|%203.11%20|%203.12-blue)](https://pypi.org/project/squeakycleantext/)
+[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
+
+A comprehensive text cleaning and preprocessing pipeline for machine learning and NLP tasks.
+</div>

 In the world of machine learning and natural language processing, clean and well-structured text data is crucial for building effective downstream models and managing token limits in language models.

tests/test_sct.py

Lines changed: 20 additions & 2 deletions
@@ -60,17 +60,25 @@ class TextCleanerTest(unittest.TestCase):
     def setUpClass(cls):
         if os.getenv('GITHUB_ACTIONS'):
             cls.ner = None
+            # Initialize empty processing classes for GitHub Actions
+            cls.ProcessContacts = None
+            cls.ProcessDateTime = None
+            cls.ProcessSpecialSymbols = None
+            cls.NormaliseText = None
+            cls.ProcessStopwords = None
+            cls.fake = None
             return

         try:
             with timeout(1200): # 20 minute timeout
                 config.CHECK_NER_PROCESS = False
+                # Initialize all the processing classes
                 cls.ProcessContacts = contact.ProcessContacts()
                 cls.ProcessDateTime = datetime.ProcessDateTime()
                 cls.ProcessSpecialSymbols = special.ProcessSpecialSymbols()
                 cls.NormaliseText = normtext.NormaliseText()
                 cls.ProcessStopwords = stopwords.ProcessStopwords()
-                cls.fake = Faker()
+                cls.fake = Faker() # Initialize Faker

                 # Override default models with smaller model for testing
                 test_models = ["dslim/bert-base-NER"] * 5 # Same small model for all languages
@@ -99,8 +107,18 @@ def setUpClass(cls):
             raise

     def setUp(self):
+        """Set up test fixtures before each test method."""
         config.CHECK_NER_PROCESS = True
-        # Use the class-level NER instance instead of creating a new one
+        if os.getenv('GITHUB_ACTIONS'):
+            self.skipTest("Skipping test in GitHub Actions")
+
+        # Copy class-level attributes to instance level
+        self.ProcessContacts = self.__class__.ProcessContacts
+        self.ProcessDateTime = self.__class__.ProcessDateTime
+        self.ProcessSpecialSymbols = self.__class__.ProcessSpecialSymbols
+        self.NormaliseText = self.__class__.NormaliseText
+        self.ProcessStopwords = self.__class__.ProcessStopwords
+        self.fake = self.__class__.fake
         self.ner = self.__class__.ner

     @settings(deadline=None)
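
The tests/test_sct.py change leaves class-level fixtures as None when the GITHUB_ACTIONS environment variable is set, and has setUp skip each test in that case. Below is a minimal, self-contained sketch of that pattern, not the project's actual test code. `ExpensiveResource`, `ExampleTest`, and `run_example` are hypothetical names introduced only for illustration.

```python
import os
import unittest


class ExpensiveResource:
    """Hypothetical stand-in for a fixture that is costly to build."""

    def process(self, text: str) -> str:
        return text.strip().lower()


class ExampleTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # In CI, leave the fixture unset; setUp will skip every test.
        if os.getenv("GITHUB_ACTIONS"):
            cls.resource = None
            return
        cls.resource = ExpensiveResource()

    def setUp(self):
        # skipTest raises unittest.SkipTest, so the test body never runs.
        if os.getenv("GITHUB_ACTIONS"):
            self.skipTest("Skipping test in GitHub Actions")

    def test_process(self):
        self.assertEqual(self.resource.process("  Hello "), "hello")


def run_example(in_ci: bool) -> unittest.TestResult:
    """Run ExampleTest with GITHUB_ACTIONS set or unset, then restore the env."""
    saved = os.environ.pop("GITHUB_ACTIONS", None)
    try:
        if in_ci:
            os.environ["GITHUB_ACTIONS"] = "true"
        suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
        result = unittest.TestResult()
        suite.run(result)
        return result
    finally:
        os.environ.pop("GITHUB_ACTIONS", None)
        if saved is not None:
            os.environ["GITHUB_ACTIONS"] = saved
```

In this sketch, `self.resource` resolves to the class-level fixture without an explicit copy, because instance attribute lookup falls back to the class. Calling `run_example(in_ci=True)` should report the single test as skipped rather than failed.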
