Resources

41 thoughts on “Resources”

  1. As I am not a registered UdeM student, I don’t think I can use the wifi there. Will I need it?

    Even if I don’t necessarily need it, is there a way that I can access it anyway? (I like to look things up sometimes in class).

    • It will be helpful, but not strictly necessary, for following along during tomorrow’s lecture; for subsequent lectures it may be more convenient if you have Internet access. Signing up for eduroam through your home institution should allow you to access the eduroam wireless at UdeM.

  2. In case anyone else is a fan of IPython and the IPython Notebook, I started to reproduce the Theano tutorials in a collection of notebooks.
    https://www.dropbox.com/sh/d663pavvbydkroc/NuDc_KoWNs

    I reproduced them as closely as possible to the tutorial, with some minor modifications here and there.

    It’s a bit time consuming, so I can’t guarantee I’ll do the whole tutorial, but I’ll keep going as long as it’s reasonable for me to do so. Hope this helps.

    P.S.: I started the notebook server with
    ipython notebook --pylab inline

  3. Where should I go to get access to the LISA lab? I tried ssh’ing to both elisa1.iro.umontreal.ca and frontal07.iro.umontreal.ca with my DGIT credentials, but access was denied.
    I was not able to go to the seminar on LISA last week. Thanks

  4. Once you have access to the DIRO computers, you will probably want to
    use our cluster at some point. Plan some time to learn how to use it.
    The instructions are here:

    http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Public/BramsUserGuide

    You can use the space in that directory: /data/lisatmp/ift6266h14

    Verify that you have access to it. Create a subfolder with your login
    and put your personal files in there:

    mkdir /data/lisatmp/ift6266h14/$USER
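
    To verify the access mentioned above, something like the following works (a sketch; the .write_test name is just a placeholder):

    ls -ld /data/lisatmp/ift6266h14                    # should exist and be writable by the course group
    touch /data/lisatmp/ift6266h14/$USER/.write_test   # quick write test in your subfolder
    rm /data/lisatmp/ift6266h14/$USER/.write_test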

  5. I want to explore the data with ipython on elisa1 but find myself struggling with speed issues (most likely limited by RAM). Should I copy the data over to, say, bart1 and play with it there?

    • Doing anything too CPU intensive on elisa1 is a good way to get yelled at by the admins. Using the instructions posted by Fred above, you can launch interactive jobs with jobdispatch --interactive. This will work just fine with an IPython terminal session, but things may get dicey with X forwarding and plotting, so I recommend using the notebook.

      If you want to run a notebook server this way, you can set it up to accept connections from any host (“ipython notebook --port= --ip=*”). What you would probably want to do is launch that jobdispatch in a screen session, detach and exit, then ssh -L [local port]:[host where condor job is running]:[remote ipython port] [username]@elisa1.iro.umontreal.ca (then ssh to maggie46 or wherever you launched the job from and reattach to keep the log output visible).
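
      A rough sketch of that workflow (the node name, port, and screen usage are placeholders; substitute whatever jobdispatch actually gives you):

      # on elisa1 or maggie46, inside a screen session so you can detach later:
      screen -S notebook
      jobdispatch --interactive
      # once you get a shell on a compute node (say bramsNN), start the server there:
      ipython notebook --no-browser --ip=* --port=8888
      # detach with Ctrl-a d, then from your own machine tunnel through elisa1:
      ssh -L 8888:bramsNN.iro.umontreal.ca:8888 username@elisa1.iro.umontreal.ca
      # and open http://localhost:8888 in your local browser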

      • I’m struggling a bit here. From maggie46 (in my /data/lisatmp/ift6266h14 directory) I run:
        jobdispatch "ipython notebook --port=8765 --ip=*"

        This does seem to start a job, but from looking at the log file that jobdispatch tells me to look at, it seems to terminate immediately.

  6. Hi, during the pylearn2 tutorial I got:

    C:\Users\Benj\Anaconda\lib\site-packages\pylearn2-0.1dev-py2.7.egg\pylearn2\datasets\preprocessing.py:843: UserWarning: This ZCA preprocessor class is known to yield very different results on different platforms. If you plan to conduct experiments with this preprocessing on multiple machines, it is probably a good idea to do the preprocessing on a single machine and copy the preprocessed datasets to the others, rather than preprocessing the data independently in each location.
    warnings.warn("This ZCA preprocessor class is known to yield very "

    Why is that, please?

    • It looks like you’re just getting one of the warnings printed out. Pylearn2 prints many warnings to make users aware of things that do not actually cause an error, but may affect their results.
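
      If that particular message gets noisy, you can silence just that warning with the standard library (a sketch; matching on the start of the message text is only one option):

      import warnings
      # hide only the ZCA reproducibility warning quoted above
      warnings.filterwarnings("ignore",
                              message="This ZCA preprocessor class is known to yield very")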

    • Hi David,

      there is no official support, but I wrote some hackish classes that may be useful for the project. All you need is a subclass of DenseDesignMatrix that populates at least the attributes X and y, with X a numpy NxM array (N examples, M features) and y an NxC array (N examples, C outputs).

      I will push my code to GitHub sometime this weekend so you’ll be able to use it as a reference.
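
      In the meantime, the skeleton is roughly this (a sketch only; the class name, constructor arguments, and dtype are made up, only DenseDesignMatrix, X and y come from the above):

      import numpy as np
      from pylearn2.datasets.dense_design_matrix import DenseDesignMatrix

      class TimitFrames(DenseDesignMatrix):  # hypothetical name
          def __init__(self, frames, targets):
              X = np.asarray(frames, dtype='float32')   # N examples x M features
              y = np.asarray(targets, dtype='float32')  # N examples x C outputs
              super(TimitFrames, self).__init__(X=X, y=y)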

      • I cleaned up the code and pushed it to GitHub. You’ll need to install NLTK to use it, as one of the classes is just an extension of NLTK’s TIMIT class. The classes in timit_dataset.py are TimitFrameData (for the “predict next sample” problem) and TimitPhoneData (for the “phone classification” problem). I hope somebody else finds them useful.

      • Joao, I’m trying to use the TimitFullCorpusReader from your GitHub. I can’t get it to load the data: I point it to timit/raw/TIMIT/TRAIN/DR1 but nothing loads, I think because some regexes aren’t matching. Do you have any tips on getting it to load the data with that class?

      • Hi David,

        sorry for the lack of documentation. You have to point it to the absolute path of the root folder (which would be /timit/raw/TIMIT). The class also needs all the {PHN,TXT,WRD} files and the DOC folder to be under the root. You can get that by copying all the readable .wav files over the raw ones. Let me know if that works for you.

      • Could you maybe simply upload your modified timit/raw/TIMIT directory, readable with that class, to a globally readable folder on the IRO network?

  7. Are there any computers with GPUs at LISA that we have permission to use for the project?

    (I’m doing the pylearn2 MLP tutorial so that I can use it as a model for a first attempt at synthesis. It recommends running the code on a computer with GPU. I’m expecting the synthesis model will be bigger than the MNIST model from the tutorial.)

    • You’ll have to clone your own copies of both Theano and pylearn2. These projects move fast enough that at the lab we leave it up to the users to manage them. This way, unexpected updates don’t cause surprises mid-project (each user/student chooses when to update Theano or pylearn2).
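
      For example (a sketch; the clone location and PYTHONPATH line are illustrations, put your copies wherever you keep code):

      cd ~
      git clone https://github.com/Theano/Theano.git
      git clone https://github.com/lisa-lab/pylearn2.git
      # make your own copies importable, e.g. in ~/.bashrc:
      export PYTHONPATH=$HOME/Theano:$HOME/pylearn2:$PYTHONPATH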

    • I can’t seem to reply to your reply, but as far as jobdispatch and IPython go, you need to use jobdispatch --interactive.

  8. I cannot run Theano-based jobs on the cluster. Whenever I try, it fails and I get this message in the error log:

    File "/data/lisatmp/ift6266h14/santosjf/lib/python2.7/site-packages/Theano-0.6.0-py2.7.egg/theano/gof/cmodule.py", line 1980, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
    Exception: Compilation failed (return status=1): g++: error trying to exec 'cc1plus': execvp: No such file or directory.

    I installed Theano to a local folder (in /data/lisatmp/ift6266h14/) and set up PYTHONPATH accordingly (I am also passing PYTHONPATH to jobdispatch via the --env parameter). Theano seems to work properly when I run it from maggie46. What might be causing this problem?

    • I am having a different compilation problem when on brams. If I run the pylearn2 train.py script on the MLP tutorial yaml file I get:

      Problem occurred during compilation with the command line below:
      g++ [snip] -lamdlibm
      /tmp/belius/theano.NOBACKUP/compiledir_Linux-2.6.35.14-106.fc14.x86_64-x86_64-with-fedora-14-Laughlin-x86_64-2.7.0-64/tmpoHsr5V/mod.cpp:6:21: fatal error: amdlibm.h: File does not exist.

      On maggie46 this doesn’t happen; the train.py script runs fine.

      • Maybe the reason it doesn’t happen on maggie46 is that on that machine amdlibm.h exists under /opt/lisa/os/include/, while on the brams cluster machine I’m assigned by jobdispatch it does not.
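
        If that is what’s going on, one thing to try (assuming the header is only pulled in because Theano’s lib.amdlibm flag is enabled in the environment; the yaml filename below is a placeholder) is to disable it for the cluster run:

        # turn off amdlibm so the generated C code stops including amdlibm.h
        THEANO_FLAGS=lib.amdlibm=False train.py mlp_tutorial.yaml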

      • As far as home directories go, yes, I believe this is to make sure that the home directory server is not brought down by cluster jobs hammering it.

      • For reference, I asked Fred about this problem, and the reason it wasn’t working was that I hadn’t run
        "if [ -e "/opt/lisa/os/.local.bashrc" ]; then source /opt/lisa/os/.local.bashrc; else source /data/lisa/data/local_export/.local.bashrc; fi"
        to set the Theano configuration on my cluster interactive instance.

        This hadn’t run, in turn, because I didn’t have access to my home directory (the idea is to put the above in your ~/.bashrc file). To make your home directory work on the cluster you may have to run the kinit command (and then "source ~/.bashrc" to configure Theano, if you put the above shell code in that file).
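
        Putting the pieces together, the sequence looks roughly like this (a sketch of the steps described above, with the paths exactly as given):

        kinit                        # get a Kerberos ticket so your home directory mounts on the node
        # this block goes in ~/.bashrc so it runs on every machine:
        if [ -e "/opt/lisa/os/.local.bashrc" ]; then
            source /opt/lisa/os/.local.bashrc
        else
            source /data/lisa/data/local_export/.local.bashrc
        fi
        source ~/.bashrc             # or log in again, to pick up the Theano configuration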

  9. Could someone confirm that /data is empty on elisa and maggie46? I dispatched a job last night on maggie, with my script under /data/lisatmp/ift6266h14/trembal, but now /data is empty on every server I checked, as if everything had been deleted.

  10. When using the Condor cluster to run my tasks, I am not able to run tasks on GPU using Theano. I switched the configuration to CPU but now all my simulation returns a MemoryError every time I try to run it. The same simulation runs perfectly on a computer with 8 GB of RAM. Is it possible that the task is being scheduled to a computer with less RAM and that’s why this is happening?