would be useful for practitioners intending to tune similar models on new datasets.
We analyze the relative importance of the hyperparameters defining the model using a Random Forest approach for the word-level task on the smaller WikiText-2 dataset for the AWD-QRNN model. The results show that weight dropout, hidden dropout, and embedding dropout impact performance the most, while the number of layers and the embedding and hidden dimension sizes matter relatively less. Similar results are observed on the word-level PTB dataset.
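The Random Forest importance analysis above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the hyperparameter names mirror those discussed in the text, but the tuning trials here are synthetic, with a made-up "validation perplexity" that depends mostly on the dropout terms so the analysis recovers a ranking like the one reported.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_trials = 200

# One row per (hypothetical) tuning trial; one column per hyperparameter.
names = ["weight_dropout", "hidden_dropout", "embedding_dropout",
         "n_layers", "emb_dim", "hid_dim"]
X = np.column_stack([
    rng.uniform(0.0, 0.7, n_trials),    # weight dropout
    rng.uniform(0.0, 0.7, n_trials),    # hidden dropout
    rng.uniform(0.0, 0.5, n_trials),    # embedding dropout
    rng.integers(1, 5, n_trials),       # number of layers
    rng.integers(100, 500, n_trials),   # embedding dimension
    rng.integers(200, 1200, n_trials),  # hidden dimension
])

# Synthetic validation perplexity (an assumption for this sketch):
# dominated by the three dropout terms, weakly affected by the rest.
y = (60
     + 30 * (X[:, 0] - 0.4) ** 2
     + 25 * (X[:, 1] - 0.3) ** 2
     + 20 * (X[:, 2] - 0.2) ** 2
     + 0.5 * X[:, 3]
     + rng.normal(0, 0.5, n_trials))

# Fit a Random Forest regressor on (hyperparameters -> perplexity) and
# read off impurity-based feature importances as a proxy for how much
# each hyperparameter drives performance.
forest = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)
for name, imp in sorted(zip(names, forest.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:20s} {imp:.3f}")
```

With real tuning logs, `X` and `y` would instead be the recorded hyperparameter settings and the validation metric of each run; the forest's `feature_importances_` then rank the hyperparameters by how much they explain the variation in performance.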