Friday, September 10, 2004

Data Mining

I have set up Weka and begun working with my LDSM data set. The LDSM data set has about 15,000 records and a number of attributes. I want to analyze it with Weka and see if I can find anything interesting. I have set up Weka on two computers, but Java has run out of memory on both. I believe this is due to the amount of memory allocated to Java, not how much memory each computer has.

http://www.cs.waikato.ac.nz/~ml/weka/tips_and_tricks.html
Attempts at running the suggested command (java -mx100000000 -oss100000000) have been unsuccessful thus far.

According to the java.sun.com website, the option is used as follows:
-Xmxn
Specify the maximum size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 2MB. Append the letter k or K to indicate kilobytes, or m or M to indicate megabytes. The default value is 64MB.
Examples:
-Xmx83886080
-Xmx81920k
-Xmx80m
I may simply need to make the number a multiple of 1024. I'll try that next... Nope, it didn't work.
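To make the multiple-of-1024 idea concrete, here is a minimal sketch of the suffix arithmetic from the documentation excerpt. The class HeapSizeExamples and its methods toBytes and isMultipleOf1024 are hypothetical names made up for illustration; this is not how the JVM itself parses the option. It suggests that the three documented forms all come out to the same 80 MB value, while 100000000 from the originally suggested command is not a multiple of 1024.

```java
public class HeapSizeExamples {
    // Hypothetical helper: converts an -Xmx style size such as "80m",
    // "81920k" or "83886080" to a number of bytes. Illustration only;
    // this is not the JVM's own option parser.
    public static long toBytes(String size) {
        char last = Character.toLowerCase(size.charAt(size.length() - 1));
        if (Character.isDigit(last)) {
            return Long.parseLong(size); // no suffix: plain bytes
        }
        long value = Long.parseLong(size.substring(0, size.length() - 1));
        if (last == 'k') {
            return value * 1024L;          // kilobytes
        }
        if (last == 'm') {
            return value * 1024L * 1024L;  // megabytes
        }
        throw new IllegalArgumentException("Unknown size suffix: " + last);
    }

    // True when the byte count is an exact multiple of 1024.
    public static boolean isMultipleOf1024(long bytes) {
        return bytes % 1024L == 0;
    }

    public static void main(String[] args) {
        String[] sizes = {"83886080", "81920k", "80m", "100000000"};
        for (String s : sizes) {
            long bytes = toBytes(s);
            System.out.println(s + " = " + bytes + " bytes, multiple of 1024: "
                    + isMultipleOf1024(bytes));
        }
    }
}
```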

I tried the following:
C:\>java -Xmx 80m
Invalid maximum heap size: -Xmx
Could not create the Java virtual machine.

C:\>java -Xmx83886080
Usage: java [-options] class [args...]
(to execute a class)
or java [-options] -jar jarfile [args...]
(to execute a jar file)

where options include:
-client to select the "client" VM
-server to select the "server" VM
... {MERELY PRINTED OUT USAGE INFORMATION}
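
Looking at the two attempts above, the first has a space between -Xmx and 80m, which appears to be why the size was rejected, and the second names no class or jar file after the option, so java has nothing to run and prints its usage text. One way to check whether the heap setting is actually taking effect is a tiny test program. The sketch below assumes a hypothetical class named HeapCheck (not part of Weka) and simply reports the maximum heap the running JVM says it will use.

```java
public class HeapCheck {
    // Maximum amount of memory the JVM will attempt to use, in whole megabytes.
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

After compiling with javac HeapCheck.java, running java -Xmx80m HeapCheck (no space after -Xmx, class name last) should report a figure close to 80 MB, though the JVM may report slightly less than what was requested. The same pattern would presumably apply to Weka, for example java -Xmx80m -jar weka.jar, assuming weka.jar is in the current directory.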

