Answers file ============ Replace the placeholders "..." with your answers. 0. Who are the group members? Gustav Eurén, Hussein Mansour, John Croft Part 1: Dijkstra's algorithm (UCS) ---------------------------------- Run uniform-cost search (Dijkstra's algorithm) on the following shortest path problems. State the number of loop iterations and the shortest path found (as printed by the program). Note: *shortest* means *minimal cost*, not smallest number of edges. 1a. Travel from Lund to Kiruna in the adjacency graph `AdjacencyGraph/citygraph-SE.txt` Loop iterations: 5427 Cost of path: 1826 Number of edges: 31 Shortest path: Lund --[16]-> Löddeköpinge --[69]-> Hjärnarp --[52]-> Åled --[6]-> Oskarström --[22]-> ..... --[68]-> Storuman --[71]-> Sorsele --[233]-> Jokkmokk --[93]-> Gällivare --[121]-> Kiruna 1b. Solve the 3x3-puzzle with starting state `/HFG/BED/C_A/` and goal state `/ABC/DEF/GH_/`. Loop iterations: 483753 Cost of path: 31 Number of edges: 31 Shortest path: /HFG/BED/C_A/ -> /HFG/BED/CA_/ -> /HFG/BE_/CAD/ -> /HF_/BEG/CAD/ -> /H_F/BEG/CAD/ -> ..... -> /ABC/DHE/_GF/ -> /ABC/DHE/G_F/ -> /ABC/D_E/GHF/ -> /ABC/DE_/GHF/ -> /ABC/DEF/GH_/ 1c. Go from from point 23:161 to point 130:211 in the grid graph `GridGraph/AR0011SR.txt`. Loop iterations: 159335 Cost of path: 366.40 Number of edges: 308 Shortest path: 23:161 --[1]-> 23:160 --[1]-> 23:159 --[1]-> 23:158 --[1]-> 23:157 --[1]-> ..... --[1.41]-> 132:207 --[1.41]-> 131:208 --[1]-> 131:209 --[1]-> 131:210 --[1.41]-> 130:211 Part 2: Word ladder ------------------- Use uniform-cost search (Dijkstra's algorithm) to solve the following word ladder problem. State the number of loop iterations and the shortest path found (as printed by the program). 2a. Find the shortest word ladder from "syster" to "broder" using the dictionary `WordLadder/swedish-saldo.txt`. Loop iterations: 71007 Cost of path: 11 Number of edges: 11 Solution path: syster -> sylter -> sylten -> synten -> synden -> ..... -> bauden -> bruden -> bräden -> bräder -> broder Part 3: The A* algorithm ------------------------ Use A* to solve the following sliding puzzles from Part 1. You only have to state the number of loop iterations and calculated distance. 3a. Solve the 3x3-puzzle with starting state `/HFG/BED/C_A/`, and goal state `/ABC/DEF/GH_/`. Loop iterations: 10650 Cost of path: 31 Algorithm | Dijkstra A* Loop iterations | 483753 10650 3b. What is the output if you try to solve the starting state `/HGF/BED/C_A/` instead? (Still the same goal state `/ABC/DEF/GH_/`) In this task, a pair of adjacent tiles in the starting state are swapped. We do not expect this to find a solution, since it can be shown that such swapping of an odd number of pairs of adjacent tiles is impossible while maintaining the state of the rest of the puzzle. As a reduced example, in a 2x2 puzzle it is plainly impossible to swap two tiles whilst also returning to blank tile to its original position. This task would be solvable if there were some way to manipulate the starting state into the starting state for 3a, for which we know a solution exists, but the illegal swap prohibits this. Empirical experimentation indicates that attempting to swap an odd number of pairs of tiles is impossible, while an even number of swaps has a solution. Loop iterations: 483841 Cost of path: No path from /HGF/BED/C_A/ to /ABC/DEF/GH_/ found. Part 4: Guessing the cost ------------------------- Use A* to find shortest paths for the remaining problems from Parts 1 and 2. You only have to state the number of loop iterations and calculated distance. 4a. Go from from point 23:161 to point 130:211 in the grid graph `GridGraph/AR0011SR.txt`. Loop iterations: 95162 Cost of path: 366.40 Algorithm | Dijkstra A* Loop iterations | 159335 95162 4b. Find the shortest word ladder from "syster" to "broder" using the dictionary `WordLadder/swedish-saldo.txt`. Loop iterations: 5099 Cost of path: 11 Algorithm | Dijkstra A* Loop iterations | 71007 5099 Part 5: Reflections ------------------- 5a. What is the approximate improvement factor in the number of loop iterations of A* over Dijkstra? Consider separately grid graphs, word ladders, and sliding puzzles. Try a number of different examples per class. GridGraph: The speedup varies modestly, with the most significant improvement seen where the euclidean distance is a reliable indicator of direction. For the more structured gridgraphs, where the algorithm can make consistent progress in the direction of the goal, a large speedup was observed (1.5X-4X). We posit that the worst case speedup occurs if the shortest path has a more random nature. In the best case, the algorithm can traverse directly towards the target unhindered, resulting in a speedup factor of several 100X, though such trivial cases are rare. Four cases were tested to verify this, and they are described in order of improvement factor: 1) -g graphs/GridGraph/maze-100x50.txt -q 27:25 130:21 Traversal direction of the shortest path is essentially random and the heuristic ceases to be meaningful. Speedup ~1.09X 2) -g graphs/GridGraph/AR0011SR.txt -q 23:161 130:211 The algorithm is forced to traverse in the opposite direction to the goal for a large part of the route. Speedup ~1.7X 3) -g graphs/GridGraph/AR0012SR.txt -q 11:73 85:127 The algorithm spends significant time traversing perpendicular to the direction of the goal. Speedup ~2.4X 4) -g graphs/GridGraph/AR0012SR.txt -q 34:59 111:112 The route towards the goal is mostly unhindered. Speedup ~3.8X WordLadder: The speedup varies extremely widely, depending on the route. Speedups ranging from 1.08X to 732X were observed. This seems plausible, since in the best case, for each given word, there is only one outgoing edge and its heuristic cost decreases (ie. becomes more similar to the goal word). In the worst case, there are many outgoing edges, and they all have the same heuristic cost (ie. they all differ by the same amount from the goal word), so the algorithm is unable to prioritize them. Low speedup could be due to a situation in which, perhaps, there are many good shortest path candidates that progress very far but then fail due to the last node only having a single edge. That is, there are many similar words that the algorithm explores, but very few similar words near the goal word. One could also say that the spanning tree might be quite evenly "fanned out". If the shortest path has a higher intermediate cost than these many candidate shorter paths, then it will take many iterations before the solution is found. For instance, when searching for "övrig", many different starting words converge on the same path that goes "artin"->"övrig". Since "artin" is very dissimilar from "övrig", it suggests that the cost heuristic is not very useful, and so the speedup will be low. Since five letter words are very common, there will be many alternative candidate paths guided by the cost heuristic, that will not converge on the solution path. In cases with extreme speedup, such as "hallå"->"major", the graph is presumably dense, with few "dead ends" along paths guided by the cost heuristic. The average speedup appears to be ~10X-100X, with "hard to reach" (those with critical paths that have few connections to the rest of the graph) words reducing the speedup to a mere ~1X-2X. The following experiments are used in the answer above: Input Dijkstra A* Speedup ----- -------- -- ------- njuta övrig 111367 107365 1.03 njuter övriga 1.45 ämnet åmade 107051 8987 11.91 syster broder 71007 5099 13.92 ämnet öring 92618 1410 65.68 eller glada 16881 211 80.00 hall fett 5428 12 452.33 hallå major 30755 42 732.26 SlidingPuzzle: The speedup for the sliding puzzle is more consistent at about 50X-150X for 4x4 puzzles. The speedup is only very modest for 2x2, since it represents a very small graph and there is little difference between an inefficient solution and one guided by a heuristic. The cost heuristic appears to be very effective and consistent. This may be due to the heuristic having more information, since it uses the position of every tile to calculate a total distance from the goal state. 5b. For which of the three above graph types did the heuristic guessCost have the most impact? What do you think is the reason for this? Both the WordLadder and Sliding Puzzle benefit enormously in the average case, though the Sliding Puzzle sees a more consistent improvement. This is perhaps due to the sliding puzzle having a denser graph representation. That is, more paths towards the goal. Therefore, it will suffer less from having to abandon long dead-ends. 5c. What are the main differences between your implementations of `AStar` and `Dijkstra`? In A*, we have an extra variable to hold the total heuristic cost f(v)=g(v)+h(v) for a particular vertex. h(v) is a cost heuristic provided by the particular problem class. We explicitly use this variable (in __lt__()) to determine priority among neighbouring vertices, whereas Dijkstra just uses g(v), the cost thus far, to determine priority. 5d. What is the asymptotic average-case complexity of `WordLadder.outgoing_edges` simultaneously in: - the alphabet size K and - the length N of the given word? Justify your answer. (Note: all involved sets are hash sets, which have constant time search time.) We iterate over both the word length N and the alphabet size K. In each inner loop, we first change the word by copying it over to a new array (of the fixed word-size) and changing one letter at a time. Copying to a new array is linear in the size of the word N, so O(N). Then we search after the changed word in the hash set, which is a O(1) operation. If a matching element is found, we append to a dynamic array, which has an amortized complexity of O(1). Therefore, the average-case complexity is the same as the worst-case complexity. O(N) * O(K) * O(N) * O(1) = O(KN^2) Appendix: general information ----------------------------- A1. How many hours did you spend on the assignment? John: 12 hours A2. Do you know of any bugs or limitations? ... A3. Did you collaborate with any other students on this lab? If so, write with whom and in what way you collaborated. Also list any resources (including the web) you have used in creating your design. Concerning the existence of solutions for the sliding puzzle: The Fifteen Puzzle A Motivating Example for the Alternating Group (Supplemental Material for Intro to Modern Algebra) - Robert A. Beeler (https://web.archive.org/web/20210107214840/https://faculty.etsu.edu/beelerr/fifteen-supp.pdf) A4. Did you encounter any serious problems? ... A5. Do you have other comments or feedback? ...