blog_tools Module

Classes

BlogManager

class blog_tools.BlogManager(post_dir='')[source]

BlogManager is used by flask_app to do all the content management for the blog. The Flask app handles the actual rendering, but all reports and extracts come from BlogManager.

__init__(post_dir='')[source]
refresh()[source]

Force cache update

property blog_df
property tag_to_post_df
property top_tags
property top_tag_menu

Menu items for top [5] tags

search_tag(regex)[source]

Search tags (list of individual tags) using a regex. Only finds one tag type.

search_regex(regex, field)[source]

Search through field using a regex and return relevant posts. Usable fields: tag, title, html, post_date, access, modify, create, size; NOT tag_list or words (a set).

field must be a column in tag_to_post_df

The regex runs as a contains query: you are responsible for anchoring the start and end.
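The difference between a contains query and an anchored one can be illustrated with the standard re module. This is a minimal sketch only: the tag values and the contains helper are hypothetical, and the real method queries a column of tag_to_post_df.

```python
import re

# Hypothetical values; the real data lives in a column of tag_to_post_df.
tags = ["tex", "latex", "context", "python"]


def contains(pattern, values):
    """Return the values where pattern matches anywhere (contains semantics)."""
    return [v for v in values if re.search(pattern, v)]


unanchored = contains(r"tex", tags)   # every value that contains "tex"
anchored = contains(r"^tex$", tags)   # anchors restrict it to the exact value
print(unanchored)
print(anchored)
```

Without ^ and $ the pattern also picks up latex and context, which is why anchoring is left to the caller.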

search_query(query)[source]

Send well formed query to blog_df

property tags

Returns an iterable of distinct tags

make_card_list(tag)[source]

Return the six most recent posts, to be rendered in snapshot cards.

list_of_posts(regex)[source]

Return list of matching blog files formatted as a HTML list. Three modes:

  1. regex is !str: run as a regex query against all words

  2. regex contains “ and ”: split and run intersection of type 3

  3. regex is run as a match against the tag list.
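One way the three modes might be told apart is sketched below. It rests on assumptions not confirmed by the source: that "!str" means the query starts with an exclamation mark, and that the separator is the literal string " and ". The function name classify_query is hypothetical.

```python
def classify_query(regex):
    """Return (mode, parts) for a list_of_posts-style query (illustrative only)."""
    if regex.startswith("!"):
        # Mode 1 (assumed marker): strip the "!" and search all words.
        return 1, [regex[1:]]
    if " and " in regex:
        # Mode 2: each part is run as a mode 3 tag match, results intersected.
        return 2, [part.strip() for part in regex.split(" and ")]
    # Mode 3: match against the tag list.
    return 3, [regex]


print(classify_query("!pandas"))
print(classify_query("tex and python"))
print(classify_query("tex"))
```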

report(kind)[source]

Create a report about the posts; returns html

kind = title, tag, date, statistics

static blog_entries_to_df(p)[source]

Convert blog entries in directory p (a Path) to a dataframe. Parse tags, title, etc.

  • Must start with an h1; generally that is the only h1 in the document

  • A final comment with a list of tags; if there is no tag it is tagged NOC

  • A span, usually near the top, with class description that becomes the og summary and the card summary. If missing then the first 150 or so characters (to a word break) are used.

  • An image with class og_image that becomes the og image (not too large!)
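The title and tag conventions above can be sketched with regular expressions. This is not the module's parser: parse_post is a hypothetical name, and the comma-separated format of the tag comment is an assumption.

```python
import re


def parse_post(html):
    """Extract title and tags from post HTML, following the conventions above."""
    title_match = re.search(r"<h1[^>]*>(.*?)</h1>", html, flags=re.S)
    title = title_match.group(1).strip() if title_match else ""
    comments = re.findall(r"<!--(.*?)-->", html, flags=re.S)
    # Assumption: the final comment is a comma-separated list of tags.
    tags = []
    if comments:
        tags = [t.strip() for t in comments[-1].split(",") if t.strip()]
    # Posts without a tag comment are tagged NOC.
    return {"title": title, "tags": tags or ["NOC"]}


tagged = parse_post("<h1>Hello</h1><p>Body</p><!-- tex, python -->")
untagged = parse_post("<h1>No tags</h1><p>Body</p>")
print(tagged)
print(untagged)
```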

static name_to_parts(fn)[source]

fn is a Path

Convert csv list of tags into links. Common values for glue are ‘, ’ or a line break.

BlogPublisher

class blog_tools.BlogPublisher(source_dir='.', update=False, dry_run=True, tex_engine='pdflatex')[source]
__init__(source_dir='.', update=False, dry_run=True, tex_engine='pdflatex')[source]

Manage creation of HTML blog-post files, including creating image files and changing links in Markdown. The objective is to publish as-is files that create TeX on the web. Adjustments: PDF images to PNG/JPG/SVG (change the link and create the PNG) and TikZ (create SVG file, find \begin{figure}, find caption, and change Markdown).

Adds a final comment to the HTML explaining where the file came from.

Creates a .bak file with the same name and including all the edits. These SHOULD NEVER BE EDITED!

web_path is the destination for the created HTML files. If ==’’ then BLOG_PATH is used

If update is True overwrite existing older HTML files, otherwise skip if exists. If dry_run is True just explain what would happen.

Note: defaults in fail safe mode!
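The interaction of update and dry_run can be summarized as a small decision function. This is a sketch of the behavior as described here, not the class's code: publish_action, html_exists and md_newer are hypothetical names, and md_newer means the Markdown file is newer than the existing HTML.

```python
def publish_action(html_exists, md_newer, update=False, dry_run=True):
    """Describe what would happen to one post under the update/dry_run flags."""
    if not html_exists:
        action = "create"
    elif update and md_newer:
        action = "overwrite"
    else:
        action = "skip"
    # In dry-run mode nothing is written; the action is only reported.
    return f"dry run: would {action}" if dry_run else action


print(publish_action(html_exists=True, md_newer=True))
print(publish_action(html_exists=True, md_newer=True, update=True, dry_run=False))
```

With the defaults (update=False, dry_run=True) no file is ever written, which is the fail-safe behavior noted above.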

The Markdown file can optionally have:

  • A final comment with a list of tags; if there is no tag it is tagged NOC

  • A span, usually near the top, with class description that becomes the og summary and the card summary. If missing then the first 150 or so characters (to a word break) are used.

  • An image with class og_image that becomes the og image (not too large!)

These elements are used by BlogManager.

Parameters
  • source_dir – source directory for files to publish, default is cwd

  • web_path – website directory; files are published here

  • update

  • dry_run

  • tex_engine – pdflatex (fast, but default fonts only) or lualatex (slow, but supports changing fonts)

publish_file(fn)[source]

fn is the markdown file to post, a str or Path object. The workflow is:

  • read markdown, split, find tags (last comment), make post filename

  • check timing and existing blog post files to see if there are any updates

  • expand all @@@s (per markdown_make)

  • expand all basic tex macros

  • deal with pdf graphics (png/jpg/svg versions of pdf files must be created separately); if none is found, leave as pdf

  • deal with TikZ pictures and figures (after graphics because it introduces new ![] elements)

  • append workflow, including provenance of file

  • Save .bak file

  • pandoc create HTML file

Parameters
  • fn – name of file (or path to file) of markdown

  • tex_engine – if pdflatex, uses blog/format/tikz.fmt and pdflatex, giving default fonts. If lualatex, runs without a template, which is slower but gives the fonts.

publish_dir(pattern='*.md', tidy=True)[source]

Publish all files matching pattern to web_path

Convert pdf figure links. DOES NOT MAKE the new images (that needs pdf2image (Linux)); it looks for likely candidates and selects one.

Completely separate from dealing with tikz.

Looks in the same folder for an appropriate non-pdf version of the file: prefers SVG then PNG then JPG.

If no file found then sets link to default format and it is up to you to create that file (noted in the workflow).
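The lookup order described above can be sketched with pathlib. The function name replace_pdf_link is hypothetical, and the choice of .png as the default format is an assumption; the source only says a default format is used.

```python
from pathlib import Path
import tempfile

PREFERRED = (".svg", ".png", ".jpg")  # preference order stated above


def replace_pdf_link(pdf_path, default=".png"):
    """Return (new_path, found) for the non-PDF sibling of pdf_path."""
    pdf_path = Path(pdf_path)
    for suffix in PREFERRED:
        candidate = pdf_path.with_suffix(suffix)
        if candidate.exists():
            return candidate, True
    # Nothing found: point at the default format; the file must be made later.
    return pdf_path.with_suffix(default), False


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "fig.png").touch()
    (Path(tmp) / "fig.jpg").touch()
    chosen, found = replace_pdf_link(Path(tmp) / "fig.pdf")
    missing, missing_found = replace_pdf_link(Path(tmp) / "other.pdf")

print(chosen.name, found)
print(missing.name, missing_found)
```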

Note, these file names are further adjusted to move them to the website static folder.

See git history for an attempt to use dvisvgm -P filename conversion, but those svg files do not render.

PublisherBase

class blog_tools.PublisherBase[source]

Container for some static functions.

Handles workflow tracking

__init__()[source]

Create a link of web_file relative to static_img_path

workflow_reset()[source]
workflow(msg)[source]

Add a message to the workflow.

workflow_show()[source]
workflow_get()[source]

Return the workflow object as an HTML comment
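As an illustration of this kind of workflow tracking, the sketch below collects messages and returns them wrapped in an HTML comment. WorkflowLog and its method names are hypothetical stand-ins, not the PublisherBase implementation, and the one-message-per-line layout is an assumption.

```python
class WorkflowLog:
    """Hypothetical stand-in for the workflow methods of PublisherBase."""

    def __init__(self):
        self._messages = []

    def reset(self):
        """Discard all recorded messages."""
        self._messages = []

    def add(self, msg):
        """Record one workflow step."""
        self._messages.append(str(msg))

    def as_html_comment(self):
        """Return the recorded steps, one per line, inside an HTML comment."""
        body = "\n".join(self._messages)
        return f"<!--\n{body}\n-->"


log = WorkflowLog()
log.add("read markdown")
log.add("pandoc create HTML file")
comment = log.as_html_comment()
print(comment)
```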

workflow_raw()[source]
process_includes(*, txt='', fn=None)[source]

Stand-alone process includes. txt = current status of buffer. fn = Path object source. If txt is None or ‘’ then the text is read from fn. This allows it to be used stand-alone.

Not static because calls functions that access the workflow. But can be part of the base.

Parameters
  • txt

  • fn

Returns

txt with includes resolved.

process_tex_macros(md_in, report=False)[source]

Expand standard general.tex macros in the md_in text.

If additional_macros is not None then use it to update the standard list

If report is True then just return the dictionary of macro substitutions

static file_name(s)[source]

Create a sensible random file name from a string s

Parameters

s

Returns

static string_hash(s)[source]

Return hash of string s, as a hex string

Parameters

s

Returns
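A hex-string hash of this kind can be produced with hashlib. The source does not say which algorithm string_hash uses, so SHA-256 below is an assumption, and hex_hash is a hypothetical name.

```python
import hashlib


def hex_hash(s):
    """Return a hash of the string s as a hex string (SHA-256 assumed here)."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


digest = hex_hash("blog post")
print(digest)
```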

static run_command(command, flag=True)[source]

Run a command and show results. Allows for weird xx behavior

Parameters
  • command

  • flag

Returns

static tidy()[source]

tidy up the cwd

Returns

static convert_pdfs(dir_name, output_folder='', pattern='*.pdf', format='png', dpi=200, transparent=True)[source]

Bulk conversion of all pdfs in dir_name to png. Linux (pdf2image) only. Pre-run! Does not adjust names in the text.

static tex_to_dict(text)[source]

Convert text, a series of \def{} macros, into a dictionary. Returns the dictionary and the regex of all keys.

static tex_splitter(x)[source]

x is a single def style tex macro
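The idea of turning macro definitions into a dictionary plus a regex of all keys can be sketched as follows. It assumes one macro per line in the form \def\name{body}, with no arguments; tex_to_dict_sketch is a hypothetical name and the macros shown are made up for the example.

```python
import re


def tex_to_dict_sketch(text):
    """Parse def-style macros, one per line, into ({name: body}, key_regex)."""
    # Greedy match to the last closing brace on the line keeps nested braces.
    macros = dict(re.findall(r"\\def\\([A-Za-z]+)\{(.*)\}", text))
    # Longest names first, so a longer macro name wins over its own prefix.
    keys = sorted(macros, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in keys)
    key_regex = re.compile(r"\\(" + alternatives + r")(?![A-Za-z])")
    return macros, key_regex


source = r"""
\def\E{\mathsf{E}}
\def\Var{\mathsf{Var}}
"""
macros, key_regex = tex_to_dict_sketch(source)
# A function replacement inserts the macro body literally, backslashes intact.
expanded = key_regex.sub(lambda m: macros[m.group(1)], r"$\E[X]$ and $\Var(X)$")
print(expanded)
```

The negative lookahead stops a short macro name from matching the start of a longer command.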

static post_tags_and_dates(dir_path)[source]

Read info from a set of proto posts

Parameters

dir_path

Returns

TikzManager

class blog_tools.TikzManager(*, raw_input='', doc_path=None, tex_engine='pdflatex')[source]
__init__(*, raw_input='', doc_path=None, tex_engine='pdflatex')[source]

Convert tikz figures in input text (raw_input) or a file (doc_path) into stand-alone svg files, saved in web_path (usually the static/img folder).

If raw_input == ‘’ then it is read from doc_path.

doc_path is used to determine if temp .tex files need updating.

When called by BlogPublisher, doc_path text has already been adjusted, hence raw_input.

When called stand-alone raw_input==’’.

static split_tikz(txt)[source]

Split text to get the tikzpicture. Format is

initial text, then groups of four:

  1. begin tag (1::4)

  2. tikz code (2::4)

  3. end tag (3::4)

  4. non-related text (4::4)
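This layout is what re.split produces when the pattern has three capturing groups, which the sketch below demonstrates. The pattern and the sample document are assumptions for illustration; the actual pattern used by split_tikz is not shown in this documentation.

```python
import re

# Three capturing groups: begin tag, tikz code, end tag.
TIKZ = re.compile(
    r"(\\begin\{tikzpicture\})(.*?)(\\end\{tikzpicture\})", flags=re.S
)

doc = (
    "Intro text\n"
    "\\begin{tikzpicture}\\draw (0,0) -- (1,1);\\end{tikzpicture}\n"
    "Middle text\n"
    "\\begin{tikzpicture}\\node {A};\\end{tikzpicture}\n"
    "Closing text"
)

parts = TIKZ.split(doc)
initial_text = parts[0]
begin_tags = parts[1::4]
tikz_code = parts[2::4]
end_tags = parts[3::4]
other_text = parts[4::4]
print(tikz_code)
```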

split_figures()[source]
list_tikz()[source]

List the figures in doc_fn

process_tikz()[source]

Process the tikz figures/tables/sidewaystables in the doc into svg files.

Functions

blog_tools.setup_parser()[source]

Set up all command line options and return parser

Returns

parser object

blog_tools.main()[source]

Handle command line operation. Needs to be a function for sphinx argparse.

BlogManager: create and manage blog posts. All posted to the default Blog website (global variable).

usage: blog_tools.py [-h] [-y] [-d SOURCE_DIRECTORY_NAME] [-u] [-a {post_file,post_dir,convert}]
                     [-t {pdflatex,lualatex}] [-f FILE_PATTERN] [-r] [-c CONVERT_FILE_PATTERN] [--format FORMAT]
                     [--dpi DPI]

Named Arguments

-y, --dry_run

dry_run mode: nothing actually done.

Default: False

-d, --directory

Source directory for files, default is cwd.

Default: “”

-u, --update

Update mode: only update files where md is newer than html.

Default: False

-a, --action

Possible choices: post_file, post_dir, convert

Determines the action: post a file, directory, or run pdf converter (Linux only).

-t, --tex

Possible choices: pdflatex, lualatex

Specify TeX engine. pdflatex = fast, no fonts; lualatex = slow with fonts.

Default: “pdflatex”

-f, --files

Files filtered matching FILE_PATTERN. For post or convert. Can be a single filename.

Default: “”

-r, --refresh

Refresh server issuing a curl http://127.0.0.1:5000/blog/reset command.

Default: False

-c, --convert

Convert all files in current directory matching CONVERT_FILE_PATTERN to FORMAT. Run from Linux (smve38_clean). For example, to convert img/*.pdf: python -m blog_tools -a convert -d img --format=*.pdf. Converted files are written to the same directory.

Default: “*.pdf”

--format

Set output file type FORMAT for convert.

Default: “png”

--dpi

Set DPI level for convert.

Default: 200

Examples:

  1. python -m blog_tools -a post_file -f *.md posts all markdown files in the current directory.

  2. python -m blog_tools -d new_posts -a post_dir posts all markdown files in the directory new_posts.

  3. python -m blog_tools -a convert -c *.pdf converts all pdf files in the current directory to 200 dpi PNGs (the defaults, set with --dpi and --format).
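For experimenting with these options without the module installed, the sketch below rebuilds a parser with the same flags, choices and defaults as listed above. It is not the output of setup_parser: the function name is hypothetical, and treating --dpi as an integer is an assumption.

```python
import argparse


def setup_parser_sketch():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="blog_tools.py")
    parser.add_argument("-y", "--dry_run", action="store_true")
    parser.add_argument("-d", "--directory", default="")
    parser.add_argument("-u", "--update", action="store_true")
    parser.add_argument("-a", "--action",
                        choices=["post_file", "post_dir", "convert"])
    parser.add_argument("-t", "--tex", choices=["pdflatex", "lualatex"],
                        default="pdflatex")
    parser.add_argument("-f", "--files", default="")
    parser.add_argument("-r", "--refresh", action="store_true")
    parser.add_argument("-c", "--convert", default="*.pdf")
    parser.add_argument("--format", default="png")
    # Assumption: DPI is parsed as an integer.
    parser.add_argument("--dpi", type=int, default=200)
    return parser


args = setup_parser_sketch().parse_args(
    ["-a", "convert", "-d", "img", "--dpi", "300"]
)
print(args)
```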