about
faculty
courses
major requirements
minor and other programs
undergraduate research
department facilities
extracurricular activities
internship and job search
after graduation
 home > undergraduate research > Darrah
Continuing Research Projects / Freeware Variations of Other Ideas / Games / Code Analysis / Web Programming / Applications to Other Areas
 Continuing Research Projects

Pebbles:

Overview: The Software Engineering project from 1996. This is used by the Geology department for students to enter data about pebbles collected in the Sedimentology class. It supports data entry, organization, and analysis of results, including drawing a variety of graphs to summarize the work and outputting data in a variety of formats.
Programming Language: C++, with one component in HyperTalk.
Prerequisites: CS 195. The scanner portion of this project is appropriate for a mathematician who has taken CS 111.
Status: The main program, in C++, is functional in part. It needs additional Macintosh graphics work to correctly draw the graphs involved in summarizing the data. The original design included a HyperTalk scanner component which could scan in data automatically. The scanning tools are available, but the HyperTalk portion remains to be coded. To do this part correctly, the student will need to do some design work on the algorithm to analyze the perimeter of the pebble.

C Pointers:

Overview: The Software Engineering project from 1998. The intent is to act as a teaching aid for CS 195 by allowing a student to practice basic pointer commands and see the effects demonstrated visually. Exercises are included that give starting linked list, doubly linked list, or tree structures, with instructions to take some action on the structure.
Programming Language: Java.
Prerequisites: CS 195, experience with Java (or willingness to learn Java).
Status: The core design and programming is done, but much work remains.

Finite Automata:

Overview: A finite automaton is a simple model of a computer that contains very little memory, and is often appropriate to model certain types of chips or embedded systems. The machine is in one of a few "states", and switches from one state to another depending on the input stimulus to the machine. Such machines are studied extensively in CS 310, Theory of Computation. The program is intended to visually demonstrate such machines, along with simulating some of the most important algorithms that apply to such machines.
Programming Language: One version in HyperTalk, and one in C++.
Prerequisites: CS 310, or current enrollment in that course.
Status: The HyperTalk version of the program is fully functional. There are, however, some ideas for additional algorithms that could be included in the program. The port to C++ is partly completed, but a significant amount of Macintosh graphics remain to be coded.

Tilings by Regular Polygons:

Overview: A "tiling" is a covering of the plane by some geometric shapes that overlap only at their edges. The current program constructs and draws tilings of the plane involving squares, triangles, hexagons, and dodecagons. Darrah has done fairly extensive descriptions and cataloging of many special kinds of tilings of the plane which can be accomplished using only regular polygons. These types of tilings are quite amenable to investigation by computer search tactics, and investigations extending previous work probably require such computer tactics. The purpose of this project is to develop a collection of objects for doing further research with such tilings. This development includes some drivers that should be able to imitate much of the work classification work on such tilings done earlier by hand.
Programming Language: C++.
Prerequisites: CS 203.
Status: The core work has been completed and is a functioning program. Much of the basic methods are available, involving a fairly sophisticated data structure to hold all the connections. Research involves some additional methods, especially some recursive algorithms to search for different kinds of possible tilings. Some additional capabilities need to be added to convert this into a program capable of discovering theorems on its own.
 
 

back to top of page

 Freeware Variations of Other Ideas

Intrusion Detection:

Overview: "Intrusion Detection" is the ability to detect someone who is attempting to break into a computer system or network. There are a large number of expensive programs one can purchase that have various kinds of intrusion detection capabilities, but little is currently available that is public domain. Nevertheless, the core C/C++ code one would need to build such a system is freely available. Good code is available for packet sniffing on a network, and this code should be fairly easy to extend to looking for certain types of packets that indicate hacking attempts.
Programming Language: C or C++.
Prerequisites: CS 195. CS 245 or other experience with networks and network protocols
Status: Jon van Niewaal is considering working on this. I have a large notebook of various "standard" tactics for attacking a network, most of which include sufficient description to allow one to recognize packets involved in that type of attack.

IBM's Expert Web Site Detection:

Overview: Most current search engines for the Web are very bad at locating the actual "expert" Web sites for a particular topic. Yahoo tries to identify such expert sites by using a large number of human editors to review sites. IBM suggested what seems likely to be an effective tactic for automatically generating lists of the "best" sites on a particular topic. They suggest collecting the sites identified by a traditional search engine, but then classifying each page according to how many other pages in this set have links to them. This takes advantage of the assumption that someone creating a Web page on X is likely to know many other pages on that topic, and will to what they think of as the best pages for X. Taking a subset of pages that score well on this measure, i.e., pages "respected" by a reasonable number of authors, then iterating the process, should be able to generate a reasonable guess as to the most useful or definitive starting points for your search.
Programming Language: Java.
Prerequisites: CS 195.
Status: Java provides easy tools to access Web pages, extract URL's from those pages, and analyze the URL's. We have a code example that does such analysis that would make a good starting point for the project. The full algorithm is up for patent consideration, but the part that's been publicly discussed can be emulated by someone else, especially if they're producing freeware.

back to top of page

 Games

Mancala:

Overview: Mancala is a class of games that are played in Africa and many other areas of the world. They involve playing pieces (seeds, pebbles, etc.) that are moved around the board, allowing one person to capture the pieces of the other player. The exact form of the rules varies tremendously. There are many shareware/buyware versions of this game available, but the idea of this project is to construct a program that can play as many different versions as possible; allowing an exploration of the many different types of rules that can be applied. The book Mancala Games, by Larry Russ (available from Prof. Chavey), includes rules for more than 100 variations of this game.
Programming Language: Java or HyperTalk.
Prerequisites: CS 195.
Status: A prototype that plays 50 or so variations of this game has been built in HyperTalk. A strong CS 111 student could work with this program itself to improve the current program, e.g. allowing it to play more variations. Converting this program to Java so that they game variations could be played over the Web would be of significant interest.

Solitaire Game Strategies:

Overview: Many games are susceptible to attempts to analyze their strategy. This type of work has been done extensively for many popular games, such as chess, checkers, bridge, etc. Many solitaire card games, however, have interesting potential for such strategy analysis as a research project. As one example, consider the solitaire game "FreeCell", commonly played on Windows machines. The player is allowed to move cards from one pile to another according to certain rules, move cards to or from their "hand", up to a maximum of four, and place cards from the Ace up onto piles according to suit. Certain situations are "bad," e.g. holding too many cards in your hand, having too many cards in one pile, etc. Other situations are "good", e.g. having empty/nearly empty piles, having many cards played to the suit piles, etc. One can assign different weights to a variety of good/bad features and use a standard game tree analysis to look for good moves. Given one particular weighting, have the computer play a large number of random games, and evaluate the efficacy of its strategy. Have it adjust the weights and repeat the process. Continue this process, using any of a variety of tactics (simple search, genetic programming, simulated annealing, etc.) to try to find a strong strategy.
Programming Language: Any.
Prerequisites: CS 195.
Status: I have a reasonable start at a large number of factors that can be "weighted" in the game of FreeCell.

Analysis of Games of Positions:

Overview: An interesting class of mathematical games consists of a graph with playing pieces on the vertices; the players move the pieces along the edges until one player has no remaining move. This project attempts to analyze which graphs lead to "interesting" games of this type.
Prerequisites: Math 200 and CS 203.
Status: Beloit alumnus Jeff Lowrey is working on this project.
Details: Full details are currently available.
 

back to top of page

 Code Analysis

Complexity Analysis:

Overview: For CS 111, I have built a "Stack Analyzer", which does various basic types of analysis of student code, such as analysis of their comments, the complexity of each method, etc. Porting some of these capabilities to a C++ code analyzer would be useful. A main idea of this project is that if one assumes certain programming standards, it is often easier to analyze the resulting code -- even without the compiler techniques used for professional level programs of this sort. While stringent coding standards are often inappropriate in industry, it is reasonable to require such standards for student programming, especially in introductory courses.
    The design of the project involves a series of passes through the student code analyzing it from various viewpoints. The first task is to read in the code files, storing the code in a linked list of objects -- one for each object in the code being analyzed. The second pass separates the code into individual methods, determining (for example) which comments appear to go with which comments, and optionally applying some programming standards (e.g., "methods should be separated by two blank lines") to the code. The third phase analyzes the comments associated with each method, e.g. inline comments, variable declaration comments, method header comments, etc. After analyzing the comments, e.g. for quantity or spelling, statistics about them are kept, but the comments are discarded. The next phase might analyze the number of separate variables used, complain about poor variable names, etc. At each phase, the project intent includes "cleaning up" the code under analysis -- putting it into a standard form, removing aspects that aren't necessary for later phases (e.g., the comments), and generally simplifying the code for later analysis. Additional analysis phases, depending on the interest of the student researcher, can include looking at the code length for individual methods, structural complexity (the number of control paths through loops and conditionals), program design complexity (the number of method calls to other types of objects), module cohesion (the amount of data sharing within an object), use of parameters and local variables, potential naming confusion, etc. The design feature of using multiple passes for these analyses is partially intended to support multiple students working on the project, possibly over several semesters.
Programming Language: C++.
Prerequisites: CS 195.
Status: HyperTalk prototype to analyze HyperTalk exists. Some ideas in that product can be useful in the overall design of the program, but because of the differences in analyzing C++ vs. analyzing HyperTalk, most of the design work and programming will be original.

Testing Analysis:

Overview: Many professional programs exist that support testing analysis, but very little is publicly available. The idea here is to insert method calls into programs on each control path that would "remember" if that path had been executed during testing of the code. When ready to do full testing of a program, a student would first run this program on their code. They would then test their code, and a report would be issued specifying which control paths had been tested, and which missed. The student could then re-run their program to do more testing, the embedded method calls would add newly tested paths to the previous database (assuming the code had not been changed), and a new cumulative test report would be generated.
Programming Language: C++
Prerequisites: CS 195.
Status: Significant amounts of design work need to be done, in addition to programming.
 

back to top of page

 Web Programming

Web Programming:

Overview: A tactic, developed by a couple of researchers at CUNY-Brooklyn, for doing distributed processing via a Web browser. You set up a program on one machine, other machines connect to it via a Web page, and are given portions of the work to do. Use this tool to solve some interesting problem that can be decomposed in this manner.
Programming Language: Java
Prerequisites: CS 195.
Status: Need to find some good projects to use. I have some ideas here, involving especially games and discrete math. The solitaire game project above is one example. Tony Abell intends to work on this.
 

back to top of page

 Applications to Other Areas

Music Analysis:

Overview: There is available code to listen to music and convert it to numeric data. Can you read that numeric data into a program and make any sense out of it? For starters, can you tell me the number of beats per measure? Can you distinguish between 3/4 and 4/4 time? Can you distinguish between 6/4 and 3/4 time? For a real challenge, can you distinguish between a Rumba and a Foxtrot, or between Swing and Foxtrot?
Programming Language: Any.
Prerequisites: CS 195, good music background, especially music theory.
Status: Not started.

Number Words & Linguistics:

Overview: One branch of linguistics looks at the similarities between words in different languages and tries to infer the genealogical connections between the languages, e.g., language family, by the amount of similarity. A simplified form of this, which is more tractable but less reliable than the full problem, is to look at the similarities in number words. I have a list of the numbers 1 to 10 in 1800 different languages, organized according to language family (and sub-family). Analyze these words, try to formalize a "linguistic distance" between two such words, then construct a graph of the distances between two languages based on this set of 10 words. See to what extent you can reconstruct the language families from this simple form of analysis.
Programming Language: Any.
Prerequisites: CS 195, some experience with phonetics.
Status: Not started.
[Home] [Back] [Top]
 
Beloit College - Department of Mathematics and Computer Science
Copyright © 2002 - Beloit College. All right reserved.