How can I convert an IPython Notebook into a Python file via the command line?
- 2025-03-18 08:56:00
- admin (original post)
- 62
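Problem description:
I am considering using *.ipynb files as the source of truth and programmatically "compiling" them into .py files for scheduled jobs/tasks.
As far as I know, the only way to do this is through the GUI. Is there a way to do it from the command line?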
Solution 1:
If you don't want to output a Python script on every save, or don't want to restart the IPython kernel:
On the command line, you can use nbconvert:
$ jupyter nbconvert --to script [YOUR_NOTEBOOK].ipynb
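As a bit of a hack, you can even run the above command from inside an IPython notebook by prefixing it with ! (which works for any command-line invocation). Inside a notebook: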
!jupyter nbconvert --to script config_template.ipynb
Before --to script was added, the option was --to python or --to=python; it was renamed as part of the move toward a language-agnostic notebook system.
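For a scheduled job, the same command can also be launched from a Python script rather than a shell. The following is only a minimal sketch, assuming the jupyter executable is on the PATH; the helper names build_nbconvert_command and convert_notebook are hypothetical and not part of nbconvert.

```python
import subprocess


def build_nbconvert_command(notebook, output_dir=None):
    """Return the argument list for `jupyter nbconvert --to script`."""
    cmd = ["jupyter", "nbconvert", "--to", "script", str(notebook)]
    if output_dir is not None:
        # --output-dir tells nbconvert where to write the generated script
        cmd += ["--output-dir", str(output_dir)]
    return cmd


def convert_notebook(notebook, output_dir=None):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(build_nbconvert_command(notebook, output_dir), check=True)
```

Calling convert_notebook("config_template.ipynb") would then run the same command as shown above. Passing the arguments as a list avoids shell quoting problems with notebook names that contain spaces.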
Solution 2:
To convert all *.ipynb files in the current directory to Python scripts, run:
jupyter nbconvert --to script *.ipynb
Solution 3:
This is a quick and dirty way to extract the code from a V3 or V4 ipynb without using ipython. It does not check cell types, etc.
import sys, json

f = open(sys.argv[1], 'r')  # input.ipynb
j = json.load(f)
f.close()
of = open(sys.argv[2], 'w')  # output.py
if j["nbformat"] >= 4:
    for i, cell in enumerate(j["cells"]):
        of.write("#cell " + str(i) + "\n")
        for line in cell["source"]:
            of.write(line)
        of.write('\n\n')
else:
    # v3 notebooks nest cells in worksheets and store code under "input"
    for i, cell in enumerate(j["worksheets"][0]["cells"]):
        of.write("#cell " + str(i) + "\n")
        for line in cell["input"]:
            of.write(line)
        of.write('\n\n')

of.close()
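Because the script above writes every cell, markdown text ends up in the .py file as well. A variant that keeps only code cells might look like the sketch below; extract_code and convert_file are hypothetical names, and the handling of a string-valued source field is an assumption about how some tools store cells.

```python
import json


def extract_code(notebook):
    """Return the source of all code cells in a parsed notebook dict."""
    if notebook.get("nbformat", 4) >= 4:
        cells = notebook["cells"]
        source_key = "source"
    else:
        # v3 notebooks keep cells inside worksheets and call the field "input"
        cells = notebook["worksheets"][0]["cells"]
        source_key = "input"

    chunks = []
    for index, cell in enumerate(cells):
        if cell.get("cell_type") != "code":
            continue  # skip markdown, raw and heading cells
        source = cell.get(source_key, "")
        # the field may be a list of lines or a single string
        text = source if isinstance(source, str) else "".join(source)
        chunks.append("#cell {}\n{}\n".format(index, text))
    return "\n".join(chunks)


def convert_file(ipynb_path, py_path):
    """Read a notebook from disk and write its code cells to py_path."""
    with open(ipynb_path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(py_path, "w", encoding="utf-8") as f:
        f.write(extract_code(notebook))
```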
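Solution 4:
Jupytext is a good fit for this kind of conversion. It converts notebooks to scripts and scripts back to notebooks, and can even produce the notebook in executed form.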
jupytext --to py notebook.ipynb # convert notebook.ipynb to a .py file
jupytext --to notebook notebook.py # convert notebook.py to an .ipynb file with no outputs
jupytext --to notebook --execute notebook.py # convert notebook.py to an .ipynb file and run it
Solution 5:
Following the earlier example, but with the newer nbformat library:
import nbformat
from nbconvert import PythonExporter

def convertNotebook(notebookPath, modulePath):
    with open(notebookPath, encoding='utf-8') as fh:
        nb = nbformat.reads(fh.read(), nbformat.NO_CONVERT)

    exporter = PythonExporter()
    source, meta = exporter.from_notebook_node(nb)

    # source is a str, so write it as text rather than as encoded bytes
    with open(modulePath, 'w+', encoding='utf-8') as fh:
        fh.write(source)
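Solution 6:
I know this is an old thread. I ran into the same problem and wanted to convert an .ipynb file to a .py file from the command line.
My search led me to ipynb-py-convert.
With the following steps I was able to get the .py file:
Install it with "pip install ipynb-py-convert".
From the command prompt, go to the directory where the ipynb file is saved.
Enter the command:
> ipynb-py-convert YourFileName.ipynb YourFilename.py
For example: ipynb-py-convert getting-started-with-kaggle-titanic-problem.ipynb getting-started-with-kaggle-titanic-problem.py
The command above creates a Python script named "YourFilename.py"; in our example, it creates the file getting-started-with-kaggle-titanic-problem.py.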
Solution 7:
You can do this through the IPython API. These import paths come from older IPython releases; since IPython 4.0 the modules live in the separate nbformat and nbconvert packages.
from IPython.nbformat import current as nbformat
from IPython.nbconvert import PythonExporter

filepath = 'path/to/my_notebook.ipynb'
export_path = 'path/to/my_notebook.py'

with open(filepath) as fh:
    nb = nbformat.reads_json(fh.read())

exporter = PythonExporter()

# source is a string of Python source code
# meta contains metadata
source, meta = exporter.from_notebook_node(nb)

with open(export_path, 'w+') as fh:
    fh.write(source)
Solution 8:
Using nbconvert 6.0.7 and jupyter client 6.1.12:
Convert a Jupyter notebook to a Python script
$ jupyter nbconvert mynotebook.ipynb --to python
Convert a Jupyter notebook to a Python script with a specified output file name
$ jupyter nbconvert mynotebook.ipynb --to python --output myscript.py
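Solution 9:
Recursively convert all *.ipynb files under the current directory to Python scripts:
shopt -s globstar nullglob  # bash: let ** match nested directories
for i in **/*.ipynb; do
    echo "$i"
    jupyter nbconvert --to script "$i"
done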
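Where the shell's ** behaviour cannot be relied on, the same recursive search can be done with pathlib. This is a sketch only: find_notebooks is a hypothetical helper, and skipping .ipynb_checkpoints folders is an added assumption about what a scheduled job would want.

```python
from pathlib import Path


def find_notebooks(root="."):
    """Return every .ipynb file under root, skipping autosave checkpoints."""
    return sorted(
        path
        for path in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
```

Each returned path can then be handed to jupyter nbconvert --to script, one file at a time.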
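Solution 10:
The common, inflexible way
Most people will suggest the nbconvert approach, but its output is a bit ugly and contains a lot of comments:
jupyter nbconvert --to script [YOUR_NOTEBOOK].ipynb
Cell-level extraction
Since a notebook is essentially JSON, you can write your own module that extracts specific cells, using a label such as "###EXTRACT###" as the first line of each cell to extract.
Here is how to use the module I wrote:
# Importing Custom Module
from ipynb_exporter import NotebookModuleBuilder as nmb
# Exporting labeled cells
nmb().ipynb_to_file(ipynb_path="mynotebook.ipynb",
                    label="###EXTRACT###",
                    py_path="mynotebookcells.py")
Here is the module: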
import json

class NotebookModuleBuilder():
    """ Class helps you extract code cells from ipynb-files by using tags """

    @staticmethod
    def _read_json(path: str) -> dict:
        """ Reads a json-file and returns a dictionary
        Args:
            path: Path to jupyter notebook (.ipynb)
        Returns:
            dictionary representation of notebook
        """
        with open(path, mode="r", encoding="utf-8") as file:
            return json.loads(file.read())

    @staticmethod
    def _get_code_cells(dictionary: dict) -> list:
        """ Finds cells of ipynb with code
        Args:
            dictionary: Dictionary from importing a ipynb notebook
        Returns:
            List of code cells
        """
        code_cells = [cell for cell in dictionary['cells'] if cell['cell_type'] == 'code']
        return code_cells

    @staticmethod
    def _get_labeled_cells(code_cells: list, label="###EXTRACT###") -> list:
        """ Gets cells with the specified label
        Args:
            code_cells: List of code cells
            label: Text that must be the first line of a cell
        Returns:
            List with the source lines of each labeled cell
        """
        label = label + "\n"
        sourced_cells = [cell for cell in code_cells if len(cell['source']) > 0]
        labeled_cells = [cell['source'] for cell in sourced_cells if cell['source'][0] == label]
        return labeled_cells

    @staticmethod
    def _write_to_file(labeled_cells: list, output_file: str) -> None:
        """ Writes the labeled cells to a file
        Args:
            labeled_cells: List of cells that should be written to a file
            output_file: Output path to py-file
        """
        # drop the label line of each cell, then separate cells with a newline
        flattened_lists = '\n'.join([''.join(labeled_cell[1:]) for labeled_cell in labeled_cells])
        with open(output_file, 'w', encoding="utf-8") as file:
            file.write(flattened_lists)

    def ipynb_to_file(self, ipynb_path: str, py_path: str, label: str = '###EXTRACT###') -> None:
        """ Writes cells labeled with ###EXTRACT### in ipynb into a py-file
        Args:
            label: Label in the first line of a cell to match
            ipynb_path: Input path to ipynb-notebook
            py_path: Output path to py-file
        """
        json_file = self._read_json(ipynb_path)
        code_cells = self._get_code_cells(json_file)
        labeled_cells = self._get_labeled_cells(code_cells, label)
        self._write_to_file(labeled_cells, py_path)
Solution 11:
The following example converts an IPython Notebook named a_notebook.ipynb into a Python script named a_python_script.py, leaving out the cells tagged with the keyword remove. I add that tag manually to cells I don't want in the script, such as visualizations and other steps that no longer need to run from the script once I am done with the notebook.
import nbformat as nbf
from nbconvert.exporters import PythonExporter
from nbconvert.preprocessors import TagRemovePreprocessor

with open("a_notebook.ipynb", 'r', encoding='utf-8') as f:
    the_notebook_nodes = nbf.read(f, as_version=4)

# drop every cell that carries the "remove" tag
trp = TagRemovePreprocessor()
trp.remove_cell_tags = ("remove",)

pexp = PythonExporter()
pexp.register_preprocessor(trp, enabled=True)

the_python_script, meta = pexp.from_notebook_node(the_notebook_nodes)

with open("a_python_script.py", 'w', encoding='utf-8') as f:
    f.write(the_python_script)
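In nbformat 4 notebooks, cell tags are stored in each cell's metadata under the "tags" key. To show which cells a tag such as remove selects, here is a stdlib-only sketch that works on the parsed JSON; drop_tagged_cells is a hypothetical helper and is not how TagRemovePreprocessor is implemented.

```python
def drop_tagged_cells(notebook, tags_to_remove=("remove",)):
    """Return a shallow copy of the notebook dict without the tagged cells."""
    unwanted = set(tags_to_remove)
    kept = [
        cell
        for cell in notebook["cells"]
        # a cell is dropped as soon as it carries any unwanted tag
        if not unwanted.intersection(cell.get("metadata", {}).get("tags", []))
    ]
    return {**notebook, "cells": kept}
```

The input dict is left unchanged, so the original notebook can still be saved with all of its cells.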
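Solution 12:
"No such file or directory" error
On the Mint [Ubuntu] system I use at work, even though Jupyter was installed and notebooks worked fine, jupyter nbconvert --to script kept failing with a "no such file or directory" error until I separately ran
sudo apt-get install jupyter-nbconvert
After that the conversion went smoothly. I am adding this in case anyone hits the same error (it confused me because I assumed the error referred to the notebook, which was definitely in the local directory; it took me a while to realize that the subcommand was not installed).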
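The jupyter launcher dispatches a subcommand such as nbconvert to a separate executable named jupyter-nbconvert, which is why the notebook server can work while the conversion fails. A quick check before running a job could look like the sketch below; has_jupyter_subcommand is a hypothetical helper, and it assumes the executable would be found through PATH.

```python
import shutil


def has_jupyter_subcommand(name):
    """True if an executable called jupyter-<name> is found on PATH."""
    return shutil.which("jupyter-" + name) is not None
```

If has_jupyter_subcommand("nbconvert") returns False, installing the nbconvert package is the likely fix.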
Solution 13:
I found two ways to convert a Jupyter Notebook to a plain Python script from the command line.
Using nbconvert
nbconvert is the tool behind the "Download as" feature of the Jupyter Notebook user interface. It can also be used as a command-line tool:
jupyter nbconvert --to python notebook.ipynb
Using jupytext
jupytext is a package for keeping .ipynb files in sync with .py files. It can also convert .ipynb files from the command line, and supports several kinds of conversion:
Convert to a Python script in the light format
jupytext --to py notebook.ipynb
Convert to a Python script in the percent format
jupytext --to py:percent notebook.ipynb
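The percent format marks each cell boundary with a # %% comment and keeps markdown cells as commented text under # %% [markdown]. To give a feel for that layout, here is a simplified stdlib imitation; to_percent_script is a hypothetical function, it is not jupytext's implementation, and real jupytext output typically also carries a metadata header that is omitted here.

```python
def to_percent_script(notebook):
    """Render a parsed nbformat-4 notebook dict in a simplified percent layout."""
    blocks = []
    for cell in notebook["cells"]:
        source = cell.get("source", "")
        text = source if isinstance(source, str) else "".join(source)
        if cell.get("cell_type") == "code":
            blocks.append("# %%\n" + text)
        elif cell.get("cell_type") == "markdown":
            # markdown is kept as comments so the file stays valid Python
            commented = "\n".join(
                "# " + line if line else "#" for line in text.split("\n")
            )
            blocks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(blocks) + "\n"
```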
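Solution 14:
There is a very nice package called nbdev, designed for authoring Python packages in Jupyter Notebooks. Like nbconvert, it can turn a notebook into a .py file, but it is more flexible and powerful because it has many additional authoring features that help you develop tests and documentation and register packages on PyPI. It was developed by the fast.ai team.
It has a bit of a learning curve, but the documentation is good and overall it is not difficult.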
Solution 15:
You can convert all notebook files to Python files as follows:
pip install jupytext
jupytext --to py *.ipynb
Solution 16:
Here is a jq solution that may be useful in some situations. Remember, notebooks are just JSON.
jq -r '.cells[] | select(.cell_type == "code") | .source[] | rtrimstr("\n")' "$filename"
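Solution 17:
The solutions given so far convert a single file at a time. Here is one that converts every file in a directory and its subdirectories.
First, install a tool that converts one file at a time, such as ipynb-py-convert:
pip install ipynb-py-convert
Then cd into the folder that contains the files and directories, and run the tool recursively over everything in the directory and its subdirectories.
In PowerShell:
foreach ($f in Get-ChildItem "." -Filter *.ipynb -Recurse){ ipynb-py-convert $f.FullName "$($f.FullName.Substring(0,$f.FullName.Length-6)).py"}
To go the other way and batch-convert .py files to .ipynb, run:
foreach ($f in Get-ChildItem "." -Filter *.py -Recurse){ ipynb-py-convert $f.FullName "$($f.FullName.Substring(0,$f.FullName.Length-3)).ipynb"}
This helped me a lot when exploring .py files: I copied the project, ran this code, and quickly tested different parts of the code as cells in Jupyter. I hope it helps others too.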
Solution 18:
jupyter nbconvert main.ipynb --to python
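Solution 19:
The following worked for me (based on Yaach's answer from Oct 25), but I needed to convert several files at once:
Start an Anaconda Prompt.
cd to the directory containing the ipynb files, for example C:\myfolder
Run:
jupyter nbconvert coding-challenge-*.ipynb --to python
Note the wildcard. The result is several coding-challenge-*.py files in the same folder.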
Solution 20:
I have built a function that does this. It can be used without installing anything.
#!/usr/bin/python
# A short routine to convert a Jupyter Notebook to a Python file
import json

def ipynb_to_py(input_ipynb_file, output_py_file=None):
    """
    Generate a Python script (.py) that includes all source code from the input Jupyter notebook (.ipynb).

    The user can input a Jupyter Notebook file from the current working directory or from a path.
    If the name for output Python file is not specified,
    the output file name will copy the file name of the input Jupyter Notebook,
    but the file extension will be changed from ".ipynb" to ".py".
    And the output Python file will be saved at the same directory of the input Jupyter Notebook.
    For example:
    ipynb_to_py("test-jupyternotebook.ipynb")
    ipynb_to_py("./test-input-dir/test-jupyternotebook.ipynb")

    The user can also specify an output file name that ends with ".py".
    If the output file name is provided, but no path to output file is added,
    the file will be saved at the current working directory.
    For example:
    ipynb_to_py("test-jupyternotebook.ipynb","test1.py")
    ipynb_to_py("./test-input-dir/test-jupyternotebook.ipynb","test2.py")

    The user can save out the file at a target directory by adding a path to the output file.
    For example:
    ipynb_to_py("test-jupyternotebook.ipynb","./test-outputdir/test3.py")
    ipynb_to_py("./test-input-dir/test-jupyternotebook.ipynb","./test-output-dir/test4.py")

    This function does not edit or delete the original input Jupyter Notebook file.

    Args:
    -----
    input_ipynb_file: The file name string for the Jupyter Notebook (ends with ".ipynb")
    output_py_file (optional): The file name for Python file to be created (ends with ".py").

    Returns:
    --------
    A Python file containing all source code in the Jupyter Notebook.

    Example usages:
    ---------------
    ipynb_to_py("test-jupyternotebook.ipynb")
    ipynb_to_py("./test-input-dir/test-jupyternotebook.ipynb")
    ipynb_to_py("test-jupyternotebook.ipynb","test1.py")
    ipynb_to_py("test-jupyternotebook.ipynb","./test-outputdir/test2.py")
    ipynb_to_py("test-jupyternotebook.ipynb","./test-outputdir/test3.py")
    ipynb_to_py("./test-input-dir/test-jupyternotebook.ipynb","./test-output-dir/test4.py")
    """
    # Check if the input file is a Jupyter Notebook
    if input_ipynb_file.endswith(".ipynb"):
        # Open the input Jupyter Notebook file and read its content in the json format
        with open(input_ipynb_file, encoding="utf-8") as notebook:
            notebook_content = json.load(notebook)

        # Only extract the source code snippet from each code cell in the input Jupyter Notebook
        source_code_snippets = [cell['source'] for cell in notebook_content['cells']
                                if cell['cell_type'] == 'code']

        # If the name for output Python file is not specified,
        # the name of input Jupyter Notebook will be used after changing ".ipynb" to ".py".
        if output_py_file is None:
            output_py_file = input_ipynb_file[:-len(".ipynb")] + ".py"

        # Create a Python script to save out all the extracted source code snippets
        with open(output_py_file, 'w', encoding="utf-8") as output_file:
            # Print out each line in each source code snippet to the output file
            for snippet in source_code_snippets:
                for line in snippet:
                    # Use end='' to avoid creating unwanted gaps between lines
                    print(line, end='', file=output_file)
                # At end of each snippet, move to the next line before printing the next one
                print(file=output_file)

        print("The path to output file:", output_py_file)
    else:
        print("The input file must be a Jupyter Notebook (in .ipynb format)!")

def main():
    pass

if __name__ == "__main__":
    main()
Solution 21:
I ran into this problem and looked online for a solution. I found some, but they still had issues, such as the annoying Untitled.txt that is created automatically when you start a new notebook from the dashboard.
So in the end I wrote my own solution:
import io
import os
import re
from nbconvert.exporters.script import ScriptExporter
from notebook.utils import to_api_path

def script_post_save(model, os_path, contents_manager, **kwargs):
    """Save a copy of notebook to the corresponding language source script.

    For example, when you save a `foo.ipynb` file, a corresponding `foo.py`
    python script will also be saved in the same directory.

    However, existing config files I found online (including the one written in
    the official documentation), will also create an `Untitled.txt` file when
    you create a new notebook, even if you have not pressed the "save" button.
    This is annoying because we usually will rename the notebook with a more
    meaningful name later, and now we have to rename the generated script file,
    too!

    Therefore we make a change here to filter out the newly created notebooks
    by checking their names. For a notebook which has not been given a name,
    i.e., its name is `Untitled.*`, the corresponding source script will not be
    saved. Note that the behavior also applies even if you manually save an
    "Untitled" notebook. The rationale is that we usually do not want to save
    scripts with the useless "Untitled" names.
    """
    # only process for notebooks
    if model["type"] != "notebook":
        return

    script_exporter = ScriptExporter(parent=contents_manager)
    base, __ = os.path.splitext(os_path)

    # do nothing if the notebook name ends with `Untitled[0-9]*`
    regex = re.compile(r"Untitled[0-9]*$")
    if regex.search(base):
        return

    script, resources = script_exporter.from_filename(os_path)
    script_fname = base + resources.get('output_extension', '.txt')

    log = contents_manager.log
    log.info("Saving script at /%s",
             to_api_path(script_fname, contents_manager.root_dir))

    with io.open(script_fname, "w", encoding="utf-8") as f:
        f.write(script)

c.FileContentsManager.post_save_hook = script_post_save
To use this script, add it to ~/.jupyter/jupyter_notebook_config.py :)
Note that you may need to restart Jupyter Notebook/Lab for it to take effect.
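The name filter in the hook is easiest to understand by trying it on a few paths. The sketch below repeats the same regular expression outside Jupyter; should_skip is a hypothetical helper that mirrors the check and is not part of the hook itself.

```python
import os
import re

UNTITLED = re.compile(r"Untitled[0-9]*$")


def should_skip(os_path):
    """True when the file name, minus its extension, ends in Untitled<digits>."""
    base, _ = os.path.splitext(os_path)
    return UNTITLED.search(base) is not None
```

Because the pattern is anchored only at the end, a name such as my_Untitled3.ipynb is skipped as well, while Untitled-analysis.ipynb is exported.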
Solution 22:
The magic command %notebook foo.ipynb exports the current IPython history to "foo.ipynb".
For more information, type %notebook?