Advent of code 2023 - Day 1: Trebuchet?!

2023-12-10
python

This year I am recording my attempts at solving the Advent of Code 2023 puzzles. This is Day 1; see https://adventofcode.com/2023/day/1

Part 1 #

Our first task is the following:

The newly-improved calibration document consists of lines of text; each line originally contained a specific calibration value that the Elves now need to recover. On each line, the calibration value can be found by combining the first digit and the last digit (in that order) to form a single two-digit number.

For example:

1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet

In this example, the calibration values of these four lines are 12, 38, 15, and 77. Adding these together produces 142.

Consider your entire calibration document. What is the sum of all of the calibration values?

Let's start Jupyter in our shell to start coding!

conda activate tf
jupyter lab --no-browser --port=8888

First, load the puzzle input:

import pandas as pd
import re

txt = pd.read_table('data/2023-12-01-1-aoc.txt', names=['code'])
txt
                              code
0                   jjfvnnlfivejj1
1                        6fourfour
2                    ninevbmltwo69
3         pcg91vqrfpxxzzzoneightzt
4    jpprthxgjfive3one1qckhrptpqdc
..                             ...
995       583sevenhjxlqzjgbzxhkcl5
996                            81s
997        2four3threesxxvlfqfive4
998        nine6eightsevenzx9twoxc
999    hmbfjdfnp989mfivefiverpzrjs

[1000 rows x 1 columns]

Second, extract the digits. I had to wrap my head around regex matching in Python first: I initially tried pandas' str.extract (which only extracts the first match), then str.extractall (which extracts all matches but puts them into a MultiIndex, which makes things more difficult in this case). So I settled on re.findall, which returns a list. To concatenate the elements of that list, we use the join function.

txt['digits'] = txt.loc[:, 'code'].apply(
    lambda x: ''.join(re.findall(r'(\d+)', x)))
txt
                              code digits
0                   jjfvnnlfivejj1      1
1                        6fourfour      6
2                    ninevbmltwo69     69
3         pcg91vqrfpxxzzzoneightzt     91
4    jpprthxgjfive3one1qckhrptpqdc     31
..                             ...    ...
995       583sevenhjxlqzjgbzxhkcl5   5835
996                            81s     81
997        2four3threesxxvlfqfive4    234
998        nine6eightsevenzx9twoxc     69
999    hmbfjdfnp989mfivefiverpzrjs    989

[1000 rows x 2 columns]
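
To make the difference between the three approaches concrete, here is a minimal sketch on two lines from the puzzle example rather than on the real input. The variable names are made up for illustration, and the groupby step is just one possible way to collapse the MultiIndex.

```python
import re
import pandas as pd

codes = pd.Series(['pqr3stu8vwx', 'a1b2c3d4e5f'])

# str.extract keeps only the first match per row
first_only = codes.str.extract(r'(\d+)')[0].tolist()

# str.extractall keeps every match, indexed by (row, match number)
all_matches = codes.str.extractall(r'(\d+)')

# collapsing that MultiIndex back to one string per row needs an extra groupby
joined_via_groupby = all_matches.groupby(level=0)[0].agg(''.join).tolist()

# re.findall returns a plain list per row that join can concatenate directly
joined_via_findall = codes.apply(
    lambda x: ''.join(re.findall(r'(\d+)', x))).tolist()

print(first_only)          # ['3', '1']
print(joined_via_groupby)  # ['38', '12345']
print(joined_via_findall)  # ['38', '12345']
```

Both routes end up with the same strings; re.findall simply gets there in one step.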

Next, combine the first and the last digit and convert the result from string to integer

txt['calibration'] = txt.loc[:, 'digits'].apply(
    lambda x: int(x[0] + x[-1]))
txt
                              code digits  calibration
0                   jjfvnnlfivejj1      1           11
1                        6fourfour      6           66
2                    ninevbmltwo69     69           69
3         pcg91vqrfpxxzzzoneightzt     91           91
4    jpprthxgjfive3one1qckhrptpqdc     31           31
..                             ...    ...          ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55
996                            81s     81           81
997        2four3threesxxvlfqfive4    234           24
998        nine6eightsevenzx9twoxc     69           69
999    hmbfjdfnp989mfivefiverpzrjs    989           99

[1000 rows x 3 columns]

Lastly, get the sum of our calibration numbers

txt.loc[:, 'calibration'].sum()
56465
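
As a sanity check, the same steps can be run in plain Python on the four example lines from the puzzle statement. This is only a sketch: the helper name calibration is made up for illustration, and it assumes every line contains at least one digit.

```python
import re

example = ['1abc2', 'pqr3stu8vwx', 'a1b2c3d4e5f', 'treb7uchet']


def calibration(line):
    # collect every digit on the line, then keep the first and the last one
    digits = ''.join(re.findall(r'\d', line))
    return int(digits[0] + digits[-1])


values = [calibration(line) for line in example]
total = sum(values)
print(values, total)  # [12, 38, 15, 77] 142
```

Note that a line with a single digit, like treb7uchet, uses that digit twice, because x[0] and x[-1] point at the same character.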

Part 2 #

Now follows part two:

Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”.

Equipped with this new information, you now need to find the real first and last digit on each line. For example:

two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen

In this example, the calibration values are 29, 83, 13, 24, 42, 14, and 76. Adding these together produces 281.

What is the sum of all of the calibration values?

Okay, let’s see if we can update the pattern matching. To deal with potentially overlapping values like oneight, which contains one as well as eight, I used the regex positive lookahead ?= as described here. Because the lookahead does not consume any characters, a match is attempted at every position, so I used \d (one digit) instead of \d+ (one or more digits); otherwise a run like 91 would be captured once as 91 and then again as 1. Afterwards, just replace the spelled-out digits with their numerical value.

# for i, r in enumerate(txt.loc[:, 'code']):
#     matches = re.findall(
#         r'(?=(\d|one|two|three|four|five|six|seven|eight|nine))', r)
#     result = ''.join([match for match in matches])
#     result = result.replace('one', '1').replace('two', '2').replace(
#         'three', '3').replace('four', '4').replace('five', '5').replace(
#         'six', '6').replace('seven', '7').replace('eight', '8').replace(
#         'nine', '9')
#     txt.loc[i, 'digits2'] = result
# txt

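# a very nice alternative suggested by Tomalak:
# r'\d' sits at index 0, so each word's list index equals its numeric value
# (raw string, so that \d is not treated as an invalid escape sequence)
digits = r'\d one two three four five six seven eight nine'.split()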


txt['digits2'] = txt.loc[:, 'code'].apply(lambda v: ''.join(
    str(digits.index(m)) if m in digits else m
    for m in re.findall(rf'(?=({"|".join(digits)}))', v)
))
txt
                              code digits  calibration digits2
0                   jjfvnnlfivejj1      1           11      51
1                        6fourfour      6           66     644
2                    ninevbmltwo69     69           69    9269
3         pcg91vqrfpxxzzzoneightzt     91           91    9118
4    jpprthxgjfive3one1qckhrptpqdc     31           31    5311
..                             ...    ...          ...     ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55   58375
996                            81s     81           81      81
997        2four3threesxxvlfqfive4    234           24  243354
998        nine6eightsevenzx9twoxc     69           69  968792
999    hmbfjdfnp989mfivefiverpzrjs    989           99   98955

[1000 rows x 4 columns]
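
To see why the lookahead and the single \d matter, here is a small sketch on one of the example lines. The variable names are made up for illustration.

```python
import re

words = 'one|two|three|four|five|six|seven|eight|nine'
line = 'zoneight234'

# plain alternation: 'one' consumes the shared 'e', so 'eight' is missed
plain = re.findall(rf'\d|{words}', line)

# lookahead: nothing is consumed, so overlapping words are both found
overlapping = re.findall(rf'(?=(\d|{words}))', line)

# lookahead with \d+: every position inside a digit run matches again
doubled = re.findall(rf'(?=(\d+|{words}))', line)

print(plain)        # ['one', '2', '3', '4']
print(overlapping)  # ['one', 'eight', '2', '3', '4']
print(doubled)      # ['one', 'eight', '234', '34', '4']
```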

Now, construct the calibration value as before…

txt['calibration2'] = txt.loc[:, 'digits2'].apply(lambda x: int(x[0] + x[-1]))
txt
                              code digits  calibration digits2  calibration2
0                   jjfvnnlfivejj1      1           11      51            51
1                        6fourfour      6           66     644            64
2                    ninevbmltwo69     69           69    9269            99
3         pcg91vqrfpxxzzzoneightzt     91           91    9118            98
4    jpprthxgjfive3one1qckhrptpqdc     31           31    5311            51
..                             ...    ...          ...     ...           ...
995       583sevenhjxlqzjgbzxhkcl5   5835           55   58375            55
996                            81s     81           81      81            81
997        2four3threesxxvlfqfive4    234           24  243354            24
998        nine6eightsevenzx9twoxc     69           69  968792            92
999    hmbfjdfnp989mfivefiverpzrjs    989           99   98955            95

[1000 rows x 5 columns]

… and get the correct sum!

txt.loc[:, 'calibration2'].sum()
55902
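
Again as a sanity check, the Part 2 logic can be run in plain Python on the seven example lines from the puzzle statement. This is a sketch under the same assumptions as before: the helper name calibration2 is made up for illustration, and every line is assumed to contain at least one digit or digit word.

```python
import re

digits = r'\d one two three four five six seven eight nine'.split()
pattern = re.compile(rf'(?=({"|".join(digits)}))')

example = ['two1nine', 'eightwothree', 'abcone2threexyz', 'xtwone3four',
           '4nineeightseven2', 'zoneight234', '7pqrstsixteen']


def calibration2(line):
    # digit words become their list index ('one' -> '1'), real digits stay as is
    tokens = [str(digits.index(m)) if m in digits else m
              for m in pattern.findall(line)]
    return int(tokens[0] + tokens[-1])


values2 = [calibration2(line) for line in example]
total2 = sum(values2)
print(values2, total2)  # [29, 83, 13, 24, 42, 14, 76] 281
```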
