Friday 6 December 2013

Big O and Sort Algorithms

Big O notation was initially a somewhat difficult idea for me to wrestle with, but after spending a decent amount of time learning about it in 165 I had come to a decent understanding of it by the time we began to discuss it here.  Essentially, Big O seeks to describe an upper bound on how badly a program's runtime can grow, ignoring constant factors and lower-order terms: we select the highest-order term, since any lesser-order term, regardless of its constant multiple, will eventually be overtaken by it once the input gets large enough.
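
For example (my own numbers, just to illustrate): a program that takes 3n^2 + 5n + 2 steps on an input of size n is in O(n^2), because 3n^2 + 5n + 2 <= 4n^2 for every n >= 6, so the lower-order 5n and the constants are swallowed by the quadratic term once the input is large enough.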

For me it was helpful to observe graphs of the different growth rates and how big-O, big-Omega, and big-Theta bounds compare to them.  That seems like the simplest way to build an intuition for runtime analysis.

Sort algorithms are numerous, though there are a couple of categories that they typically fall into.  We have looked at Selection, Insertion, Bubble, Merge, and Quick Sort.

Selection sort seems only to be a good idea if you are certain it will be run over extremely small lists of data.  The algorithm runs slowly because it seeks out the smallest remaining item on each pass and moves it to the front of the unsorted portion, which means it must traverse what remains of the list once per item in the list, an obviously problematic quadratic runtime.  Not a recommended sort in most contexts.
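
A rough sketch of the idea in Python (my own simplified version, not the code from lecture):

def selection_sort(lst):
    """Sort lst in place by repeatedly moving the smallest remaining item
    to the front of the unsorted portion."""
    for i in range(len(lst)):
        # find the index of the smallest item in lst[i:]
        smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[smallest]:
                smallest = j
        # swap it into position i
        lst[i], lst[smallest] = lst[smallest], lst[i]

The nested loop is where the quadratic runtime comes from.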

Insertion sort is somewhat similar to selection but iterates the opposite way: it runs through the sorted portion of the list to insert each subsequent item where it belongs, therefore taking only one item from the unsorted portion per pass, and only traversing the entire sorted portion when the new item has to move past every element already sorted.  In the usual backward-scanning version (sketched below), already sorted data is actually its best case, since each new item needs only a single comparison; even so, it is not a large improvement over selection in general, having a similar quadratic worst-case runtime.  Also not typically recommended.
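
A sketch of that backward-scanning version (again my own simplified code, not the lecture's):

def insertion_sort(lst):
    """Sort lst in place by inserting each item into the sorted portion before it."""
    for i in range(1, len(lst)):
        item = lst[i]
        j = i - 1
        # shift sorted elements right until item's spot is found
        while j >= 0 and lst[j] > item:
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = item

On already sorted data the while loop never runs, which is why that is the best case.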

Bubble sort is the worst sort algorithm.  Do not use.  It must traverse the entire list comparing adjacent pairs of elements, moving the largest remaining one to the back on every pass.  Very unwieldy.  We have been told there are situations where it may help but none that I have yet seen.
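
For completeness, roughly what those passes look like (my own sketch):

def bubble_sort(lst):
    """Sort lst in place by repeatedly swapping adjacent out-of-order pairs,
    so the largest remaining item bubbles to the back on each pass."""
    for i in range(len(lst) - 1, 0, -1):
        for j in range(i):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]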

Merge Sort is an efficient algorithm which splits the list in half, sorts each half separately, and then merges the sorted halves back together.  It can generally be an effective algorithm to use.
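
Roughly, as I understand it (my own sketch, returning a new list rather than sorting in place):

def merge_sort(lst):
    """Return a new sorted list containing the items of lst."""
    if len(lst) <= 1:
        return lst[:]
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])
    right = merge_sort(lst[mid:])
    # merge the two sorted halves into one sorted list
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged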

Quick Sort seems the best of the default choices we have been presented with (aside from the builtins), as it can repeatedly split the list around a pivot into smaller lists, sort those, and combine them back together into a whole in a very effective and time-efficient manner, though a badly chosen pivot can slow it down considerably.
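
A sketch of the idea (my own simplified version, using list comprehensions and the first element as the pivot, which is not necessarily how it was presented in lecture):

def quick_sort(lst):
    """Return a new sorted list by partitioning around a pivot at each step."""
    if len(lst) <= 1:
        return lst[:]
    pivot = lst[0]
    smaller = [x for x in lst[1:] if x < pivot]
    larger = [x for x in lst[1:] if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger)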

Overview of Week 12

This week of the course was perhaps one of the most interesting, as it relied on our ability to link together many of the previous concepts of the course in the creation of a complete binary tree type.  This is a special subclass of a regular binary tree in which each depth must be filled before any nodes at a greater depth can be added: nodes are populated level by level from the root downward, and left to right within each level.

This required extensive knowledge of both recursion and trees, which have been two of our primary topics of discussion throughout the semester.  It also provides an easy and reliable way to ensure trees are populated in a specific order, which could aid us in future work with trees: if we have access to the list of values in the order they were added to a complete binary tree, we can bypass our usual methods of traversal and write a much simpler method.  Very interesting stuff!
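
A small sketch of the kind of shortcut I mean, assuming the values are stored in level order so that the node at index i has its children at indices 2*i + 1 and 2*i + 2 (my own example, not code from the assignment):

def children(values, i):
    """Return the values of the children of the node stored at index i,
    given the level-order list of a complete binary tree."""
    kids = []
    for child_index in (2 * i + 1, 2 * i + 2):
        if child_index < len(values):
            kids.append(values[child_index])
    return kids

# For the complete tree built from [1, 2, 3, 4, 5]:
#       1
#      / \
#     2   3
#    / \
#   4   5
# children([1, 2, 3, 4, 5], 1) == [4, 5]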

Overview of Week 11


This week we spent the majority of our time exploring the property function as a way to encapsulate classes and limit direct interaction from the user end.  This seems like useful knowledge: Python is somewhat notoriously open with its code (Professor Heap's nudist colony comment comes to mind), and having at least some way of shielding attributes against direct manipulation both keeps some code at least partially private and provides some comfort to users who have come to Python from another programming language with stricter standards of privacy and user-code separation.  In that sense, I found it very useful to learn about, having a background in BASIC and web-side design.
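
A toy example of the kind of shielding I mean (my own made-up class, not the one from lecture):

class Temperature:
    """Store a temperature while refusing physically impossible values."""

    def __init__(self, celsius):
        self._celsius = celsius

    def _get_celsius(self):
        return self._celsius

    def _set_celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    # users interact with t.celsius; the underscored attribute stays "private"
    celsius = property(_get_celsius, _set_celsius)

# t = Temperature(20)
# t.celsius = 25     # goes through the setter
# t.celsius = -300   # raises ValueError instead of silently storing garbage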

Overview of Week 9

Sorting Algorithms and Time Complexity were the main topics for this week, and I found them relatively easy.  The sorts we discussed were all covered in 108 last year, and Big O notation has been the topic in 165 for the last several weeks.  It was good to get a fresh perspective on them, but nothing struck me as overtly complex or beyond quick understanding.

Also this week was the conclusion of Assignment 2.  Regexes are an interesting format, though not one I immediately see the practical use for.  In theory, however, the way they can be represented and the applications of recursion and linked lists they demonstrate were very helpful to comprehension.

Overview of Week 8

Binary Search Trees were a primary topic for this week, along with understanding how to effectively traverse a tree whose elements are sorted by side (lesser elements form the left children of each node, greater elements the right).  This also proved an important concept to me personally, as it allowed me to understand recursion more quickly by thinking of how binary search trees divide their sorted elements, in a way similar to the merge sort algorithm.  That eventually helped me link various concepts through trees as a recursive structure and gain a better understanding of recursion in general; I wonder if this conceptual link might aid other students as well.
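
Roughly how I picture the search (my own sketch, with my own names): at every node one whole subtree is discarded, which is the part that reminded me of merge sort's divide-and-conquer.

class BSTNode:
    """A node in a binary search tree: lesser values go left, greater go right."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def contains(node, value):
    """Return True iff value is in the BST rooted at node."""
    if node is None:
        return False
    if value == node.value:
        return True
    if value < node.value:
        return contains(node.left, value)   # the right subtree can be ignored
    return contains(node.right, value)      # the left subtree can be ignored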

Week 7 Overview

Linked lists have been the primary topic of the week, although I have some trouble imagining instances that would render them preferable to the more versatile binary trees.  They are also more volatile: destroying the reference to the first node of a linked list will cause Python to garbage-collect the entire list, since each value can only be reached through the reference held by the node before it, in a trail leading from the front down to the last value.
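
A minimal sketch of what makes the structure fragile (my own names, not the class from lecture):

class LLNode:
    """A node in a singly linked list: one value plus a reference to the next node."""
    def __init__(self, value, nxt=None):
        self.value = value
        self.nxt = nxt

# a three-node list: front -> 1 -> 2 -> 3
front = LLNode(1, LLNode(2, LLNode(3)))

# rebinding front leaves nothing referring to the first node, so it, and in
# turn every node it led to, becomes eligible for garbage collection
front = None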

Personally, I do not see myself using this data type outside of assignments where I may be compelled to, though perhaps there are additional circumstances I have not yet learned about in which the linked list may be preferable.

I haven't had a chance to closely examine Assignment 2 and the properties of regexes yet, but some of my friends in the class have told me they feel linked lists may be pertinent to the solution.

Overview of Week 6

Our primary topic for this week was the introduction of binary trees.  This is a special type of object, and one that ties closely to the concept of recursion.  Binary trees are defined recursively in terms of parents and children, filling in levels of the tree from the root through the branches down to the leaves.  This type of data structure can be useful for search and sort algorithms, as well as being more practical than a linked list (in most cases) for relating pieces of data to one another, especially since most of the linked lists we've used have only been able to refer to their own value and a "next" node, and I don't see how a linked list node could usefully refer to more than two neighbours (previous and next).

With trees, on the other hand, referring to the parent and children gives at least 3 references for a node with 2 children, and if we were to construct a tree that allowed for more than 2 children, an even larger web of references would be possible.
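
A sketch of what I mean by the extra references (my own simplified class):

class TreeNode:
    """A binary tree node that keeps references to its parent and both children,
    so an interior node can reach up to three neighbours."""
    def __init__(self, value, parent=None, left=None, right=None):
        self.value = value
        self.parent = parent
        self.left = left
        self.right = right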

Furthermore, traversal of trees is best accomplished recursively.  We have looked at inorder, preorder, and postorder traversals of binary trees, and it's been fairly easy to see the potential applications of each.  Postorder seems like it would be the least used, although still relevant depending on the case.  What is important to realize is that as much as we are learning new things about computer science, we are also learning new ways to look at, think about, and process the things we already know how to do.
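
The three traversals, roughly as I understand them (my own sketches, assuming a node like the TreeNode above with value, left, and right attributes):

def inorder(node):
    """Return the values: left subtree, then the node itself, then the right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

def preorder(node):
    """Return the values: the node itself first, then the left subtree, then the right."""
    if node is None:
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)

def postorder(node):
    """Return the values: both subtrees first, then the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]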