Colab Jupyter Notebook Tips
=======================
Magics
Manage environment variables
%env NUM_THREADS=4
Get details
%timeit?
list?
Measure execution time
# Measure entire block
%%time
# Measure a single line
%timeit func()
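Outside of a notebook, the same measurement is available from the standard-library timeit module, which is what %timeit uses under the hood. A minimal sketch:

```python
import timeit

# time a snippet the way %timeit does, but with an explicit run count
elapsed = timeit.timeit("sum(range(1000))", number=10_000)
print(f"10,000 runs took {elapsed:.4f}s total")
```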
Autoreload
If you edit the code of an imported module or package, you usually need to restart the notebook kernel or use reload() on the specific module. That can be quite annoying. With the autoreload magic command, modules are automatically reloaded before any of their code is executed.
%load_ext autoreload
%autoreload 2
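The manual route autoreload replaces is importlib.reload. A sketch of that route, using a throwaway module (demo_mod is a hypothetical file created just for the demo):

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True         # always recompile from source on reload
moddir = tempfile.mkdtemp()
sys.path.insert(0, moddir)

modfile = pathlib.Path(moddir, "demo_mod.py")
modfile.write_text("VALUE = 1\n")
import demo_mod                        # first import picks up VALUE = 1

modfile.write_text("VALUE = 2\n")      # edit the module on disk
importlib.invalidate_caches()
demo_mod = importlib.reload(demo_mod)  # what %autoreload 2 does for you
print(demo_mod.VALUE)
```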
Execute other python/notebook
# Execute code in demo.ipynb (can be a .py file)
%run ./demo.ipynb
# Load content in demo.py to the cell
%load ./demo.py
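In a plain script, the closest analogue of %run for .py files is the standard-library runpy.run_path, which executes the file and hands back its globals. A sketch with a hypothetical demo.py created on the fly:

```python
import pathlib
import runpy
import tempfile

# stand-in for an existing demo.py
script = pathlib.Path(tempfile.mkdtemp(), "demo.py")
script.write_text("result = 6 * 7\n")

ns = runpy.run_path(str(script))  # executes the file, returns its globals
print(ns["result"])  # 42
```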
Store data between notebooks
data = 'this is the string I want to pass to a different notebook'
%store data
del data
# Read `data` in another notebook
%store -r data # variable name must be the same
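%store keeps its database under the IPython profile directory; when you need the same hand-off outside IPython, pickling to an agreed-upon file works. A sketch (shared_data.pkl is a hypothetical path both notebooks can see):

```python
import pathlib
import pickle
import tempfile

store = pathlib.Path(tempfile.mkdtemp(), "shared_data.pkl")

# notebook A: save the value (equivalent of %store data)
data = "this is the string I want to pass to a different notebook"
store.write_bytes(pickle.dumps(data))
del data

# notebook B: load it back (equivalent of %store -r data)
data = pickle.loads(store.read_bytes())
print(data)
```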
Write cell code to python file
%%writefile pythoncode.py
import numpy

def append_if_not_exists(arr, x):
    if x not in arr:
        arr.append(x)

def some_useless_slow_function():
    arr = list()
    for i in range(10000):
        x = numpy.random.randint(0, 10000)
        append_if_not_exists(arr, x)
Display file content
%pycat pythoncode.py
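The same write-then-inspect round trip is plain file I/O; a sketch of what %%writefile and %pycat do, minus the syntax highlighting:

```python
import pathlib
import tempfile

target = pathlib.Path(tempfile.mkdtemp(), "pythoncode.py")

# %%writefile equivalent: dump the cell body to a file
target.write_text("print('hello from pythoncode.py')\n")

# %pycat equivalent: read the file back and display it
source = target.read_text()
print(source)
```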
Profiling
# Profile run time
%prun func()
# Profile run time line by line
# !pip install line_profiler
%load_ext line_profiler
%lprun -f func func()
# Profile memory usage line by line
# !pip install memory_profiler
%load_ext memory_profiler
%mprun -f func func()  # func must be defined in a file, not in the notebook
# Measure the memory use of a single statement
%memit func()
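%prun is a thin wrapper around the standard-library cProfile; the script-level equivalent looks roughly like this (slow is a toy stand-in for func):

```python
import cProfile
import io
import pstats

def slow():
    # toy workload to give the profiler something to measure
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow()
profiler.disable()

# render the top 5 entries by cumulative time, like %prun's report
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```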
Use other kernels for the cell
%%python2
%%python3
%%ruby
%%perl
%%bash
%%R
Tools
Write fast code using Cython
!pip install cython
%load_ext Cython
%%cython
def multiply_by_2(float x):
    return 2.0 * x
multiply_by_2(23.)  # 46.0
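As a sanity check of the Cython result, the pure-Python version (which the typed Cython function should match, only slower) is just:

```python
def multiply_by_2(x: float) -> float:
    return 2.0 * x

print(multiply_by_2(23.0))  # 46.0
```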
Install Jupyter Contrib Extensions
pip install https://github.com/ipython-contrib/jupyter_contrib_nbextensions/tarball/master
pip install jupyter_nbextensions_configurator
jupyter contrib nbextension install --user
jupyter nbextensions_configurator enable --user
RISE: jupyter notebook presentation
conda install -c damianavila82 rise # recommended
pip install RISE # less recommended
Display media
from IPython.display import display, Image
display(Image('demo.jpg'))
Connect GitHub to Binder
Binder turns a GitHub repository of notebooks into a live, executable environment you can share with others.
Upload / Download Files
If you want to upload / download files via browser, you can use Colab’s python API.
from google.colab import files
files.download('file_path')
files.upload() # will display an upload button
Use Google Drive as Data Storage
Every time you open a notebook, Google starts a new Docker container, so everything is lost once the notebook tab is closed. As a result, you will want to retain your training results. The good news is you can mount your Google Drive as an external disk on the Colab container. Even better, you can upload a dataset to Google Drive and use it directly from Colab after mounting.
To mount Google Drive, execute the following script in a Colab notebook:
(This section will ask for access permission to Google Drive.)
# Setup authentication to mount Google Drive
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
After granting permission, you can mount Google Drive with the following script:
# Execute when running on colab
# Mount Google Drive
!mkdir -p drive
!google-drive-ocamlfuse drive
import os
os.chdir('drive/Colab Notebooks')
Now your working directory is the Colab Notebooks directory in your Google Drive.
Tensorboard
Run TensorBoard with: tensorboard --logdir path/to/logdir
tf.summary.histogram('pred', output)
tf.summary.scalar('loss', loss)  # add loss to scalar summary
merge_op = tf.summary.merge_all()
writer = tf.summary.FileWriter('./log', sess.graph)  # write to file
for step in range(100):
    # train and fetch merged summaries
    _, result = sess.run([train_op, merge_op], {tf_x: x, tf_y: y})
    writer.add_summary(result, step)
Debugging Python code using breakpoint() and pdb
%debug  # launch the ipdb post-mortem debugger after an exception
%pdb    # toggle automatic debugger invocation whenever an exception is raised
Syntax:
breakpoint()  # Python 3.7+
# OR
import pdb; pdb.set_trace()  # Python 3.6 and below
Python 3.7 added the built-in breakpoint() function, which does the same thing as pdb.set_trace() on Python 3.6 and below. The debugger pauses execution at the line where you placed the breakpoint; you can then step through the code line by line, find the bug, fix it, and run the code again.
The easiest way to debug a Jupyter notebook is to use the %debug magic command. Whenever you encounter an error or exception, just open a new notebook cell, type %debug and run the cell. This will open a command line where you can test your code and inspect all variables right up to the line that threw the error.
Type "n" and hit Enter to run the next line of code (the → arrow shows your current position). Use "c" to continue until the next breakpoint, and "q" to quit the debugger and stop code execution.
Since I mostly use Python 3.7, I will cover breakpoint() only.
def debugger(a, b):
    breakpoint()
    result = a / b
    return result

print(debugger(5, 0))
To resume execution from the debugger prompt, type c and press Enter.
Commands for debugging:
c -> continue execution
q -> quit the debugger/execution
n -> step to next line within the same function
s -> step to next line in this function or a called function
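One detail worth knowing about breakpoint(): it routes through sys.breakpointhook(), which consults the PYTHONBREAKPOINT environment variable on every call, so you can switch all breakpoints off without editing code. A sketch:

```python
import os

os.environ["PYTHONBREAKPOINT"] = "0"  # "0" disables every breakpoint() call

def debugger(a, b):
    breakpoint()  # skipped entirely while PYTHONBREAKPOINT=0
    return a / b

print(debugger(5, 1))  # 5.0
```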
The IPython debugger is another great option. Import it and use set_trace() anywhere in your notebook to create one or more breakpoints. When executing a cell, it will stop at the first breakpoint and open the command line for code inspection. You can also set breakpoints in the code of imported modules, but don't forget to import the debugger there as well.
from IPython.core.debugger import set_trace
set_trace()
PixieDebugger
As a prerequisite, install PixieDust using the following pip command: pip install pixiedust. You’ll also need to import it into its own cell: import pixiedust.
pip install pixiedust
import pixiedust
To invoke the PixieDebugger for a specific cell, simply add the %%pixie_debugger magic at the top of the cell and run it. Detailed tutorial: https://medium.com/codait/the-visual-python-debugger-for-jupyter-notebooks-youve-always-wanted-761713babc62
Original article: https://j0e1in.github.io/dev_notes/ml/tools/jupyter-colab.html#use-google-drive-as-data-storage