@@ -6,3 +6,63 @@ This package contains a set of distributed text modeling algorithms implemented
-**Gibbs sampling LDA**: the implementation is adapted from Spark PRs (#1405 and #4807) and JIRA SPARK-5556 (https://github.com/witgo/spark/tree/lda_Gibbs, https://github.com/EntilZha/spark/tree/LDA-Refactor, https://github.com/witgo/zen/tree/lda_opt/ml, etc.), with several extensions added (e.g., support for the MLlib interface, predict, and in-place state update)
-**Online HDP (hierarchical Dirichlet process)**: implemented based on the paper "Online Variational Inference for the Hierarchical Dirichlet Process" (Chong Wang, John Paisley and David M. Blei)
-**Notes from Stephen Boesch December 2017**
This repo lacked working code for the HDP, so I added an `OnlineHDPExample` program. In addition, the dependencies were updated to Spark 2.2, Scala 2.11, and the latest Breeze (linear algebra library).
17/12/21 23:52:28 INFO AbstractConnector: Stopped Spark@5b3c8e38{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
17/12/21 23:52:28 WARN FileSystem: exception in the cleaner thread but it will continue to run
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:2989)
at java.lang.Thread.run(Thread.java:748)
[WARNING] thread Thread[org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner,5,org.apache.spark.mllib.topicModeling.OnlineHDPExample] was interrupted but is still alive after waiting at least 12891msecs
[WARNING] thread Thread[org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner,5,org.apache.spark.mllib.topicModeling.OnlineHDPExample] will linger despite being asked to die via interruption
[WARNING] NOTE: 1 thread(s) did not finish despite being asked to via interruption. This is not a problem with exec:java, it is a problem with the running code. Although not serious, it should be remedied.