Logfile shows no runs for the test set, and the plots are empty for the test set and accuracy #4

@saharudra

When I run the code, I get the following output:

(rn_env) exx@ubuntu:/data/Rudra/RelationNetworks-CLEVR$ python                          
Python 3.6.6 (default, Jun 28 2018, 00:00:00)                                         
[GCC 4.8.4] on linux                                             
Type "help", "copyright", "credits" or "license" for more information.                 
>>> import torch                                                   
>>> exit()                                                                     
(rn_env) exx@ubuntu:/data/Rudra/RelationNetworks-CLEVR$ pyton -m train --clevr-dir /data/DATASETS/CLEVR_v1.0/ --model 'original-fp' | tee logfile.log
No command 'pyton' found, did you mean:                           
 Command 'python' from package 'python-minimal' (main)                                                                                                                                                             
 Command 'pytone' from package 'pytone' (universe)                    
pyton: command not found                                           
(rn_env) exx@ubuntu:/data/Rudra/RelationNetworks-CLEVR$ python -m train --clevr-dir /data/DATASETS/CLEVR_v1.0/ --model 'original-fp' | tee logfile.log                                                             
TRAIN:   0%|          | 0/350 [00:00<?, ?it/s
Loaded hyperparameters from configuration config.json, model: original-fp: {'state_description': False, 'g_layers': [256, 256, 256, 256], 'question_injection_position': 0, 'f_fc1': 256, 'f_fc2': 256, 'dropout': 0.5, 'lstm_hidden': 128, 'lstm_word_emb': 32, 'rl_in_size': 52}
Building word dictionaries from all the words in the dataset...                                   
==> using cached dictionaries: /data/DATASETS/CLEVR_v1.0/questions/CLEVR_built_dictionaries.pkl
Word dictionary completed!                                                                                                                                                                                         
Initializing CLEVR dataset...
==> using cached questions: /data/DATASETS/CLEVR_v1.0/questions/CLEVR_train_questions.pkl
==> using cached questions: /data/DATASETS/CLEVR_v1.0/questions/CLEVR_val_questions.pkl
CLEVR dataset initialized!
Supposing original DeepMind model
Training (350 epochs) is starting...
Dataset reinitialized with batch size 640
Current learning rate: 1e-05
███████████████████████████████▊| 1093/1094 [11:21:28<00:37, 37.41s/it, loss=1.92]
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/Rudra/RelationNetworks-CLEVR/train.py", line 418, in <module>
    main(args)
  File "/data/Rudra/RelationNetworks-CLEVR/train.py", line 356, in main
    train(clevr_train_loader, model, optimizer, epoch, args)
  File "/data/Rudra/RelationNetworks-CLEVR/train.py", line 40, in train
    output = model(img, qst)
  File "/data/Rudra/virtualenvs/rn_env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/Rudra/RelationNetworks-CLEVR/model.py", line 200, in forward
    x = torch.cat([x, self.coord_tensor], 1)    # (B x 24+2 x 8*8)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 469 and 640 in dimension 0 at /pytorch/torch/lib/TH/generic/THTensorMath.c:2897
Train Epoch: 1 [0/700160 (0%)] Train loss: 39.945804595947266
Train Epoch: 1 [6400/700160 (1%)] Train loss: 36.57775611877442
Train Epoch: 1 [12800/700160 (2%)] Train loss: 29.848896408081053
Train Epoch: 1 [19200/700160 (3%)] Train loss: 24.984291648864748
Train Epoch: 1 [25600/700160 (4%)] Train loss: 20.945134353637695
.
.
.
Train Epoch: 1 [684800/700160 (98%)] Train loss: 1.8508247494697572
Train Epoch: 1 [691200/700160 (99%)] Train loss: 1.8768051743507386
Train Epoch: 1 [697600/700160 (100%)] Train loss: 1.8581566572189332

(rn_env) exx@ubuntu:/data/Rudra/RelationNetworks-CLEVR$ 
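
The numbers in the output above can be cross-checked with a little arithmetic. This is only a sketch built from values visible in the log (batch size 640, 1094 batches, and the 469 in the error message). The variable names are mine, and reading "700160" as batches times batch size is an assumption, not something confirmed from train.py.

```python
import math

# Numbers read off the log above.
batch_size = 640        # "Dataset reinitialized with batch size 640"
num_batches = 1094      # progress bar total: 1093/1094
last_batch = 469        # "Got 469 and 640 in dimension 0"

# Dataset size implied by 1093 full batches plus one short batch.
num_samples = (num_batches - 1) * batch_size + last_batch
print(num_samples)                          # 699989

# Cross-check: that many samples gives 1094 batches, with a
# remainder equal to the size reported in the error.
print(math.ceil(num_samples / batch_size))  # 1094
print(num_samples % batch_size)             # 469

# The "700160" in the "Train Epoch" lines equals num_batches * batch_size.
print(num_batches * batch_size)             # 700160
```

If this reading is right, the crash happens on the final, shorter batch of the first epoch. That would explain why no test-set lines were ever written: the run appears to die before it gets past training for epoch 1.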

I have also attached my logfile. When I run the plot function, every plot except the training loss is empty. Please let me know where the issue might be. Thanks.
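
For reference, the RuntimeError in the traceback says `torch.cat` received a tensor with 469 rows and one with 640 rows in dimension 0. Below is a minimal sketch of two possible workarounds, assuming `self.coord_tensor` is pre-built for the full batch size. `add_coords` and the stand-in tensors are hypothetical and not taken from model.py; only `torch.cat` and the `drop_last` argument of `DataLoader` are real PyTorch API.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def add_coords(x, coord_tensor):
    """Concatenate coordinate channels, slicing the cached tensor
    down to the batch size actually received (hypothetical helper)."""
    b = x.size(0)
    return torch.cat([x, coord_tensor[:b]], 1)

# Hypothetical stand-in for self.coord_tensor, built for batch size 640.
coord_tensor = torch.zeros(640, 2, 64)

full_batch = torch.zeros(640, 24, 64)   # a normal batch
short_batch = torch.zeros(469, 24, 64)  # the final, smaller batch

out_full = add_coords(full_batch, coord_tensor)
out_short = add_coords(short_batch, coord_tensor)
print(out_full.shape, out_short.shape)

# Alternative: have the loader discard the incomplete final batch.
toy_dataset = TensorDataset(torch.zeros(10, 1))
toy_loader = DataLoader(toy_dataset, batch_size=4, drop_last=True)
batch_sizes = [batch[0].size(0) for batch in toy_loader]
print(batch_sizes)
```

Slicing keeps every sample, while `drop_last=True` skips the leftover samples each epoch. Which one suits this repository depends on how the coordinate tensor is built in model.py, which I have not checked.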

logfile.log
