software engineering

Archived Posts from this Category

FFmpeg’s fps filter, documented

Posted on 30 Apr 2020 | Tagged as: FFmpeg, robobait, software engineering, web technology

The FFmpeg media editing software is a valuable tool, but its documentation is only barely adequate. It certainly does not answer all the questions I have, as a user trying to understand why FFmpeg is not doing what I want it to do.

Fortunately, FFmpeg is open source, so when the documentation fails, one can read the source. I wanted to learn about presentation time stamps and time bases. The fps video filter source code, in file libavfilter/vf_fps.c, was an instructive read.

I took what I learned from reading that source, and did a complete rewrite of the fps filter documentation. It is longer than the original fps filter documentation (as archived in April 2020 — you can check if the present documentation is any better). I believe the rewrite is more complete and more accurate. I contributed the rewrite to the FFmpeg project. I submitted it as a patch to the ffmpeg developers list. Discussion continues. I don’t know if this contribution will ultimately get accepted.

So, for the benefit of FFmpeg users who are web-searching for answers, here is my documentation of FFmpeg’s fps video filter.


Duplicate entry names in a single directory on a file server: solved!

Posted on 31 Aug 2019 | Tagged as: robobait, software engineering

I have just seen — and solved — the most remarkable thing in a deep corner of my large archive disk: a single directory containing two entries (subdirectories) with the same name and same inode number. I will describe the problem, the diagnosis, and the cure for the benefit of others who encounter the same problem.

I was moving my archive of old files from one Network-Attached Storage (NAS) file server on my home network to another. Both old and new servers use netatalk AFP software to present Mac-style volumes to my Mac computer. Both run underlying Unix-like OSs and file systems (but a different one on each).

I moved the archive by dragging the top-level directory data folder, using Finder on my Mac, from the old server to the new. Partway through, the copy aborted with an error message like “a directory with the name .externalToolBuilders already exists”. This is remarkable. Each directory on the old server might have many entries or few, but each entry must have a different name. It is one of the fundamental rules of file systems. I was not combining two directories together, where an entry from one directory might collide with an entry with the same name from the other directory.

How to convert Google Docs to Markdown format

Posted on 30 Apr 2019 | Tagged as: robobait, software engineering

Recently I needed to convert a Google Docs word-processing document to Markdown format (GitHub’s dialect). A simple web search turned up several hits, most of them unhelpful. I finally found a Google Apps Script to do the conversion, which was almost, but not quite, suitable. But with a simple modification, it did the trick. I am sharing it here, in the hope that it will be helpful to someone else searching for “convert Google Docs to Markdown”.


Top Posts: How to escape apostrophe (‘) in MySql?

Posted on 28 Feb 2019 | Tagged as: robobait, software engineering, web technology

I post on various forums around the net, and a few of my posts there get some very gratifying kudos. I’ve been a diligent contributor to StackOverflow, the Q-and-A site for software developers. I’m in the top 5% of contributors overall. Here’s my top-voted answer in StackOverflow currently.

The question, How to escape apostrophe (‘) in MySql?,  was asked by anonymous user4951 in March 2012 (and copy-edited by someone else). In abbreviated form, it was:


“2 1+ 1 sections”: a quick way to refer to a part of a picture

Posted on 31 Oct 2018 | Tagged as: robobait, software engineering

For one of my consulting clients, I found myself writing command-line tools that operate on videos. One tool zoomed in on a portion of the video frame, to let the user examine it closely. How do you tell a command-line tool to zoom in on one portion of a video frame? I came up with an idea, which I call “2 1+ 1 sections”. It is a quick way for a user to refer to a part of a picture, using a concise text notation. I haven’t used it for that client, but I’ll post it here in case it comes in useful later on.

Top Posts: Why Unicode has separate codepoints for “characters with identical glyphs”

Posted on 31 May 2018 | Tagged as: i18n, multilingual, robobait, software engineering, Unicode

I post on various forums around the net. Sometimes I am able to tap into such inspiration that I want to add that essay to my portfolio. Such was the case here. The question: Why does Unicode have separate codepoints for characters with identical glyphs? My response begins: The short answer to this question is, “Unicode encodes characters, not glyphs”. But like many questions about Unicode, a related answer is “plain text may be plain, but it’s not simple”.…

How to add an SSL certificate to LiClipse to permit EGit access to a git repo

Posted on 26 Dec 2017 | Tagged as: robobait, software engineering, web technology

I was contributing to the FFmpeg project recently. They keep their source code in a Git repo, accessed via SSL. I had an awkward error message:

SSL reported: PKIX path building failed:
sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target

The problem was that my tool handling the SSL communication lacked the SSL certificate which validated the communication with the project. I could dismiss the error and proceed without validating the SSL security. The better solution was to supply the right SSL certificate to the communication tool, so that it could validate the SSL security with no awkwardness. Here’s how I accomplished that. This post is offered as search engine fodder, in hopes that others will benefit from these instructions.

LiClipse (for Mac) includes its own copy of the JRE

Posted on 10 Dec 2017 | Tagged as: robobait, software engineering

LiClipse is the developer’s tool I use for writing Python code. Based on the Eclipse IDE, it accepts numerous plugins to support other programming languages like Java and C, and related tools, such as the Git version control system. Eclipse is mostly Java language code, and it runs on a JRE (Java Runtime Environment). Last month, I wanted to contribute code to a git repository which I accessed via HTTPS. That worked more smoothly if I could put an SSL certificate into the JRE, and I’ll skip the details of why for now.

So I looked up the Java Home of the JRE installed on my Mac OS X laptop (short answer: it’s the path output by running /usr/libexec/java_home). I installed the SSL certificate there. It did not work. That was a sign that LiClipse did not use that JRE. Did it perhaps include its own JRE?  After some investigation, I found out the answer: yes!

Here’s the explanation. I hope this helps others.

When I run “ffmpeg” in the background, how do I prevent “suspended (tty output)”?

Posted on 04 Nov 2017 | Tagged as: robobait, software engineering

I recently had a problem, “When I run ffmpeg in the background, how do I prevent suspended (tty output)?”. I solved it. Here is my solution, in the hopes that it will help others seeing the same problem.

I have a sh script which calls ffmpeg on several files. When I try to run this script in the background, redirecting output to a file, the job starts but then immediately suspends:

% bin/mp3convert.sh path/a/b &> ~/tmp/log.txt &
[1] 93352
% [1]  + suspended (tty output)  bin/mp3convert.sh path/a/b &>


Python multi-line doctests, and “Got nothing” message

Posted on 31 Jan 2017 | Tagged as: Python, robobait, software engineering

Recently I was writing a Python-language tool, and some of my doctests (test fixtures, within module comments) were failing. When I tried to import the StringIO module in my test, I got a quite annoying message, “Got nothing”, and the test didn’t work as I wanted. I asked StackOverflow. User wim there gave me a crucial insight, but didn’t explain the underlying cause of my problem. I read the doctest code, and came up with an explanation that satisfied me. I am posting it here, as an aid to others. The gist of the insight: What looks like a multi-line doctest fixture is in fact a succession of single-line doctest “Examples”, some of which return no useful result but which set up state for later Examples. Each single-line Example should have a >>> prefix, not a ... prefix. But, there are Examples that require the ... prefix. The difference lies in Python’s definition of an Interactive Statement.

The Question

I posted a question much like this to StackOverflow:

Why is importing a module breaking my doctest (Python 2.7)?

I tried to use a StringIO instance in a doctest in my class, in a Python 2.7 program. Instead of getting any output from the test, I get a response, “Got nothing”.

This simplified test case demonstrates the error:

#!/usr/bin/env python2.7
# encoding: utf-8

class Dummy(object):
    '''Dummy: demonstrates a doctest problem
    >>> from StringIO import StringIO
    ... s = StringIO()
    ... print("s is created")
    s is created
    '''

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Expected behaviour: test passes.

Observed behaviour: test fails, with output like this:

% ./src/doctest_fail.py
**********************************************************************
File "./src/doctest_fail.py", line 7, in __main__.Dummy
Failed example:
    from StringIO import StringIO
    s = StringIO()
    print("s is created")
Expected:
    s is created
Got nothing
**********************************************************************
1 items had failures:
    1 of 1 in __main__.Dummy
***Test Failed*** 1 failures.

Why is this doctest failing? What change do I need to make in order to be able to use StringIO-like functionality (a literal string with a file interface) in my doctests?

(I had originally suspected the StringIO module of being part of the problem. My original question title was, “Why is use of StringIO breaking my doctest (Python 2.7)”. When I realised that suspicion was incorrect, I edited the question on StackOverflow.)

The Answer

StackOverflow expert wim was quick with the crucial insight: “It’s the continuation line syntax (...) that is confusing doctest parser.” Wim then rewrote my example so that it functioned correctly. Excellent!  Thank you, wim.

The Explanation

I wasn’t satisfied, however. Wim’s answer fixed my example, but it didn’t explain the underlying cause of my problem. I read the doctest code, and came up with an explanation that satisfied me. Below is an improved version of the answer I posted to StackOverflow at the time.

The example fails, because it uses the PS2 syntax (...) instead of PS1 syntax (>>>) in front of separate simple statements.

Change ... to >>>:


#!/usr/bin/env python2.7
# encoding: utf-8

class Dummy(object):
    '''Dummy: demonstrates a doctest problem
    >>> from StringIO import StringIO
    >>> s = StringIO()
    >>> print("s is created")
    s is created
    '''

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Now the corrected example, renamed doctest_pass.py, runs with no errors. It produces no output, meaning that all tests pass:

% ./src/doctest_pass.py

Why is the >>> syntax correct? The Python Library Reference for doctest, 25.2.3.2. How are Docstring Examples Recognized? should be the place to find the answer, but it isn’t terribly clear about this syntax.

Doctest scans through a docstring, looking for “Examples”. Where it sees the PS1 string >>>, it takes everything from there to the end of the line as an Example. It also appends any following lines which begin with the PS2 string ... to the Example (See: _EXAMPLE_RE in class doctest.DocTestParser, lines 584-595). It takes the subsequent lines, until the next blank line or line starting with the PS1 string, as the Wanted Output.
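This parsing step can be observed without running any test. Here is a minimal sketch, assuming Python 3 and the public doctest.DocTestParser.get_examples() method. BROKEN and FIXED are just the two docstrings from this post; they are only parsed, never executed, so the Python 2 StringIO import does no harm.

```python
import doctest

# The docstring from the question: one >>> line followed by two ... lines.
BROKEN = '''Dummy: demonstrates a doctest problem
    >>> from StringIO import StringIO
    ... s = StringIO()
    ... print("s is created")
    s is created
    '''

# The corrected docstring: every simple statement gets its own >>> prompt.
FIXED = BROKEN.replace("...", ">>>")

parser = doctest.DocTestParser()
broken_examples = parser.get_examples(BROKEN)
fixed_examples = parser.get_examples(FIXED)

# Show how many Examples the parser found, with each source and wanted output.
for label, examples in (("broken", broken_examples), ("fixed", fixed_examples)):
    print(label, "->", len(examples), "Example(s)")
    for example in examples:
        print("  source:", repr(example.source), "want:", repr(example.want))
```

The broken docstring parses as a single Example whose source holds all three lines, while the fixed one parses as three Examples, of which only the last has any wanted output.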

Doctest compiles each Example as a Python “interactive statement”, using the compile() built-in function in an exec statement (See: doctest.DocTestRunner.__run(), lines 1314-1315).

An “interactive statement” is a statement list ending with a newline, or a Compound Statement. A compound statement, e.g. an if or try statement, “in general, […spans] multiple lines, although in simple incarnations a whole compound statement may be contained in one line.” Here is a multi-line compound statement:

if 1 > 0:
    print("As expected")
else:
    print("Should not happen")

A statement list is one or more simple statements on a single line, separated by semicolons. Each of the following two lines is a statement list; the second holds two simple statements:


from StringIO import StringIO
s = StringIO(); print("s is created")

So, the question’s doctest failed because it contained one Example with three simple statements, and no semicolon separators. Changing the PS2 strings to PS1 strings succeeds, because it turns the docstring into a sequence of three Examples, each with one simple statement. Although these three lines work together to set up one test of one piece of functionality, they are not a single test fixture. They are three tests, two of which set up state but do not really test the main functionality.
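The rule can be checked against compile() itself. This is a hedged sketch, assuming Python 3, where compiling in "single" mode raises SyntaxError when the source holds more than one statement. As far as I can tell, Python 2.7 (which this post targets) instead compiled only the first statement without complaint, which would fit the “Got nothing” report. The helper name compiles_as_single is made up for illustration.

```python
def compiles_as_single(source):
    """Return True if source is accepted as one interactive statement."""
    try:
        compile(source, "<example>", "single")
    except SyntaxError:
        return False
    return True

# Three simple statements on three lines: not one interactive statement.
three_lines = 'from StringIO import StringIO\ns = StringIO()\nprint("s is created")\n'

# A statement list: simple statements on one line, separated by semicolons.
statement_list = 's = 1; print("s is created")\n'

# A compound statement may span several lines and still be one statement.
compound = (
    "if 1 > 0:\n"
    '    print("As expected")\n'
    "else:\n"
    '    print("Should not happen")\n'
)

for name, source in (("three_lines", three_lines),
                     ("statement_list", statement_list),
                     ("compound", compound)):
    print(name, compiles_as_single(source))
```

Nothing here is executed, only compiled, so the check is independent of whether a StringIO module exists.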

By the way, you can see the number of Examples which doctest recognises by using the -v flag. Note that it says, “3 tests in __main__.Dummy”. One might think of the three lines as one test unit, but doctest sees three Examples. The first two Examples have no expected output. When the Example executes and generates no output, that counts as a “pass”.


% ./src/doctest_pass.py -v
Trying:
    from StringIO import StringIO
Expecting nothing
ok
Trying:
    s = StringIO()
Expecting nothing
ok
Trying:
    print("s is created")
Expecting:
    s is created
ok
1 items had no tests:
    __main__
1 items passed all tests:
    3 tests in __main__.Dummy
3 tests in 2 items.
3 passed and 0 failed.
Test passed.

Within a single docstring, the Examples are executed in sequence. State changes from each Example are preserved for the following Examples in the same docstring. Thus the import statement defines a module name, the s = assignment statement uses that module name and defines a variable name, and so on. The doctest documentation, 25.2.3.3. What’s the Execution Context?, obliquely discloses this when it says, “examples can freely use … names defined earlier in the docstring being run.”

The preceding sentence in that section, “each time doctest finds a docstring to test, it uses a shallow copy of M’s globals, so that … one test in M can’t leave behind crumbs that accidentally allow another test to work”, is a bit misleading. It is true that one test in M can’t affect a test in a different docstring. However, within a single docstring, an earlier test will certainly leave behind crumbs, which might well affect later tests.
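Both behaviours can be seen in a small experiment. The sketch below assumes Python 3 and drives doctest.DocTestParser.get_doctest() and doctest.DocTestRunner directly. The names module_globals, first, second and crumb are hypothetical stand-ins for module M, two of its docstrings, and a name that one of them defines.

```python
import doctest

parser = doctest.DocTestParser()
runner = doctest.DocTestRunner(verbose=False)

# Stand-in for the globals of module M (hypothetical; empty for simplicity).
module_globals = {}

# First docstring: the second Example relies on a name the first one defined.
first = parser.get_doctest(
    ">>> crumb = 41\n>>> crumb + 1\n42\n",
    module_globals.copy(), "first", None, 0)

# Second docstring: it gets its own copy of the globals, so it sees no crumb.
second = parser.get_doctest(
    ">>> 'crumb' in globals()\nFalse\n",
    module_globals.copy(), "second", None, 0)

# Each run() call reports the failed and attempted counts for that docstring.
first_result = runner.run(first)
second_result = runner.run(second)
print(first_result, second_result)
```

The first docstring passes only because its second Example can see the name left behind by the Example before it. The second docstring passes because that name never reaches its separate copy of the globals.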

But there is an example doctest, in the Python Library Reference for doctest, 25.2.3.2. How are Docstring Examples Recognized?, which uses ... syntax. Why doesn’t it use >>> syntax? Because that example consists of an if statement, which is a compound statement on multiple lines. As such, its second and subsequent lines are marked with the PS2 strings.  It’s unfortunate that this is the only example of a multi-line fixture in the documentation, because it can be misleading about when to use PS1 instead of PS2 strings.
