The Scientific Method

Reality → Theory

A theory lets us make predictions about reality.

  1. Repeat an experiment
  2. Generalize it (try it with different things)
  3. Make a hypothesis
  4. Test the hypothesis
    • Make a prediction
    • Perform an experiment
    • Make an observation
    • Support / Refine / Reject the hypothesis
  5. Your hypothesis has become a theory
  6. Go to step 4

A theory is a framework that explains and predicts observations.

Using

  • Code
  • Runs
  • Failures
  • More runs

we can obtain a 'theory' (or 'diagnosis') of what the error is.

Input           Expected   Output
<b>foo</b>      foo        foo
"<b>foo</b>"    "foo"      <b>foo</b>

Which hypotheses do you think are consistent with these observations?

  1. Double quotes are stripped from tagged input
  2. Tags in double quotes are not stripped
  3. The tag <b> is always stripped from the input
  4. The word bar is always stripped from the input

These may be two separate issues, but are more likely to be linked.
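
One way to answer the question mechanically is to encode each hypothesis as a check on an (input, output) pair and run it over the two observations from the table. This is only a sketch: the functions h1 to h4 are hypothetical names, and each one is one possible reading of the hypothesis wording. A hypothesis that makes no claim about an observation is treated as consistent with it.

```python
import re

# The two observations from the table: (input, actual output).
OBSERVATIONS = [
    ('<b>foo</b>', 'foo'),
    ('"<b>foo</b>"', '<b>foo</b>'),
]

def h1(inp, out):
    """1. Double quotes are stripped from tagged input."""
    if '<' in inp and '"' in inp:
        return '"' not in out
    return True  # no claim about this input

def h2(inp, out):
    """2. Tags in double quotes are not stripped."""
    if len(inp) >= 2 and inp[0] == '"' and inp[-1] == '"':
        tags = re.findall(r'<[^>]+>', inp[1:-1])
        return all(tag in out for tag in tags)
    return True  # no claim about unquoted input

def h3(inp, out):
    """3. The tag <b> is always stripped from the input."""
    if '<b>' in inp:
        return '<b>' not in out
    return True

def h4(inp, out):
    """4. The word bar is always stripped from the input."""
    if 'bar' in inp:
        return 'bar' not in out
    return True  # bar never occurs, so nothing can refute this yet

HYPOTHESES = {1: h1, 2: h2, 3: h3, 4: h4}

def consistent(check):
    """A hypothesis is consistent if no observation contradicts it."""
    return all(check(inp, out) for inp, out in OBSERVATIONS)

results = {n: consistent(check) for n, check in HYPOTHESES.items()}
for n, ok in results.items():
    print(n, "consistent" if ok else "refuted")
```

Under this reading, hypotheses 1, 2 and 4 are consistent with the observations, and hypothesis 3 is refuted by the second row. Hypothesis 4 survives only because no observation contains bar, which is why it still needs a test of its own.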

Looking at the first hypothesis:

If this is correct, then the input "foo" should come out as foo. Let's test that:

#!/usr/bin/python3

def removeHtmlMarkup(s):
    tag   = False                           
    quote = False
    out   = ""

    for c in s:
        if c == '<' and not quote:          # Start of markup
            tag = True
        elif c == '>' and not quote:        # End of markup  
            tag = False
        elif c == '"' or c == "'" and tag:  # Quote          
            quote = not quote               
        elif not tag:
            out = out + c

    return out

""" Our hypotheses """
if __name__ == "__main__":
    print(removeHtmlMarkup('"foo"'), '\tExpect "foo"\tHypothesis: foo')
    # Test the hypothesis about bar
    # print(removeHtmlMarkup('"bar"'), '\tExpect "bar"\tHypothesis: [BLANK]')
    # Test if it matters whether there is anything else present
    # print(removeHtmlMarkup('""'),    '\tExpect ""\tHypothesis: [BLANK]')

[Download strip7.py]
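
The same experiment can be written so that the prediction, the observation and the verdict from step 4 are explicit values rather than text to compare by eye. This is a sketch, not part of strip7.py: the function is copied unchanged from the listing above (defect included), and the variable names below are illustrative.

```python
def removeHtmlMarkup(s):
    # Copied unchanged from the listing above, including the defect
    # that is under investigation.
    tag   = False
    quote = False
    out   = ""

    for c in s:
        if c == '<' and not quote:          # Start of markup
            tag = True
        elif c == '>' and not quote:        # End of markup
            tag = False
        elif c == '"' or c == "'" and tag:  # Quote
            quote = not quote
        elif not tag:
            out = out + c

    return out

test_input  = '"foo"'
expected    = '"foo"'   # what a correct implementation should return
prediction  = 'foo'     # what the hypothesis says we will see instead
observation = removeHtmlMarkup(test_input)

if observation == prediction:
    verdict = "supported"
elif observation == expected:
    verdict = "rejected"
else:
    verdict = "refine"  # neither outcome: the hypothesis needs rework

print(repr(observation), verdict)
```

Keeping expected and prediction as separate values makes it clear that a supported hypothesis here means the program misbehaves in the way we guessed, not that the program is correct.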

The test prints foo, without the quotes, so we can refine the first hypothesis:

  1. Double quotes are stripped from tagged input
    • Double quotes are in fact stripped from all input, tagged or not
    • Some part of the code must be doing this
    • Only one place in our code handles quotes:

        elif c == '"' or c == "'" and tag:  # Quote          
            quote = not quote

We make a new hypothesis about the error:

  • The error is a result of the variable tag being set.
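
This hypothesis predicts that tag is True whenever the quote branch runs. One possible experiment, sketched below, is an instrumented copy of the function that records the value of tag each time that branch is taken. The name removeHtmlMarkupTraced and the tag_at_quote list are hypothetical additions for this sketch; the logic is otherwise the same as in the listing above.

```python
def removeHtmlMarkupTraced(s):
    """Same logic as removeHtmlMarkup, plus a record of tag's value
    every time the quote branch executes."""
    tag   = False
    quote = False
    out   = ""
    tag_at_quote = []   # instrumentation only, does not affect the result

    for c in s:
        if c == '<' and not quote:          # Start of markup
            tag = True
        elif c == '>' and not quote:        # End of markup
            tag = False
        elif c == '"' or c == "'" and tag:  # Quote
            tag_at_quote.append(tag)        # observe tag at this point
            quote = not quote
        elif not tag:
            out = out + c

    return out, tag_at_quote

out, trace = removeHtmlMarkupTraced('"foo"')
print(out, trace)
```

For the input "foo" the quote branch runs twice and tag is False both times. That observation does not match the prediction, so this hypothesis would have to be rejected or refined in the next round.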