# phpGlluchMiriadaX
phpGlluchMiriadaX is a collection of scripts to obtain the metadata from a group of MiriadaX courses.
Tested in November 2015.
A Freeling server can be used for POS tagging (optional). To use it, edit normalize.php and set your server's IP and port. Save the HTML file of each course you are interested in, one by one, into the courses dir.
These files have to be executed with the PHP CLI in this order:
- php descriptions.php: extracts the information from each course. The results of this step are in the json0 dir.
- php normalize.php: finds the words that are not stopwords and saves each one with its POS tag.
- php clean.php: deletes the punctuation marks from the output of the previous step.
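
The three steps above can be chained so that a failure stops the run before the next script starts. The following is only a minimal sketch, assuming it is run from the directory that contains the scripts. The function name run_pipeline and the PHP_BIN variable are hypothetical helpers for this example, not part of the project.

```shell
#!/bin/sh
# Sketch: run the three scripts in the documented order.
# PHP_BIN is a hypothetical override for the PHP CLI binary (defaults to "php").
run_pipeline() {
    php_bin="${PHP_BIN:-php}"
    for script in descriptions.php normalize.php clean.php; do
        echo "Running $script"
        # Stop at the first failing step so later steps never read incomplete output.
        "$php_bin" "$script" || { echo "Step $script failed" >&2; return 1; }
    done
}
```

After sourcing the snippet, call run_pipeline from the project directory. If Freeling is not used, the same order can still be followed by hand, running only the steps you need.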
The course information will be in the json2 directory (with POS tags) or in json0 (without them).
phpGlluchCoursera
phpGlluchCourseTalk
phpGlluchEdX