|
Continuing Research Projects
|
Pebbles:
- Overview:
The Software Engineering project from
1996. This is used by the Geology department
for students to enter data about pebbles
collected in the Sedimentology class.
It supports data entry, organization,
and analysis of results, including drawing
a variety of graphs to summarize the work
and outputting data in a variety of formats.
- Programming
Language: C++, with one component
in HyperTalk.
- Prerequisites:
CS 195. The scanner portion of this project
is appropriate for a mathematician who
has taken CS 111.
- Status:
The main program, in C++, is functional
in part. It needs additional Macintosh
graphics work to correctly draw the graphs
involved in summarizing the data. The
original design included a HyperTalk scanner
component which could scan in data automatically.
The scanning tools are available, but
the HyperTalk portion remains to be coded.
To do this part correctly, the student
will need to do some design work on the
algorithm to analyze the perimeter of
the pebble.
C
Pointers:
- Overview:
The Software Engineering project from
1998. The intent is to act as a teaching
aid for CS 195 by allowing a student to
practice basic pointer commands and see
the effects demonstrated visually. Exercises
are included that give starting linked
list, doubly linked list, or tree structures,
with instructions to take some action
on the structure.
- Programming
Language: Java.
- Prerequisites:
CS 195, experience with Java (or willingness
to learn Java).
- Status:
The core design and programming is
done, but much work remains.
Finite Automata:
- Overview:
A finite automaton is a simple model of
a computer that contains very little memory,
and is often appropriate to model certain
types of chips or embedded systems. The
machine is in one of a few "states",
and switches from one state to another
depending on the input stimulus to the
machine. Such machines are studied extensively
in CS 310, Theory of Computation. The
program is intended to visually demonstrate
such machines, along with simulating some
of the most important algorithms that
apply to such machines.
- Programming
Language: One version in HyperTalk,
and one in C++.
- Prerequisites:
CS 310, or current enrollment in that
course.
- Status:
The HyperTalk version of the program is
fully functional. There are, however,
some ideas for additional algorithms that
could be included in the program. The
port to C++ is partly completed, but a
significant amount of Macintosh graphics
remain to be coded.
Tilings by Regular Polygons:
- Overview:
A "tiling" is a covering of
the plane by some geometric shapes that
overlap only at their edges. The current
program constructs and draws tilings of
the plane involving squares, triangles,
hexagons, and dodecagons. Darrah has done
fairly extensive descriptions and cataloging
of many special kinds of tilings of the
plane which can be accomplished using
only regular polygons. These types of
tilings are quite amenable to investigation
by computer search tactics, and investigations
extending previous work probably require
such computer tactics. The purpose of
this project is to develop a collection
of objects for doing further research
with such tilings. This development includes
some drivers that should be able to imitate
much of the work classification work on
such tilings done earlier by hand.
- Programming
Language: C++.
- Prerequisites:
CS 203.
- Status:
The core work has been completed and
is a functioning program. Much of the
basic methods are available, involving
a fairly sophisticated data structure
to hold all the connections. Research
involves some additional methods, especially
some recursive algorithms to search for
different kinds of possible tilings. Some
additional capabilities need to be added
to convert this into a program capable
of discovering theorems on its own.
-
-
back to top of page
Freeware Variations of Other Ideas
|
Intrusion
Detection:
- Overview:
"Intrusion Detection" is the
ability to detect someone who is attempting
to break into a computer system or network.
There are a large number of expensive
programs one can purchase that have various
kinds of intrusion detection capabilities,
but little is currently available that
is public domain. Nevertheless, the core
C/C++ code one would need to build such
a system is freely available. Good code
is available for packet sniffing on a
network, and this code should be fairly
easy to extend to looking for certain
types of packets that indicate hacking
attempts.
- Programming
Language: C or C++.
- Prerequisites:
CS 195. CS 245 or other experience with
networks and network protocols
- Status:
Jon van Niewaal is considering working
on this. I have a large notebook of various
"standard" tactics for attacking
a network, most of which include sufficient
description to allow one to recognize
packets involved in that type of attack.
IBM's
Expert Web Site Detection:
- Overview:
Most current search engines for the Web
are very bad at locating the actual "expert"
Web sites for a particular topic. Yahoo
tries to identify such expert sites by
using a large number of human editors
to review sites. IBM suggested what seems
likely to be an effective tactic for automatically
generating lists of the "best"
sites on a particular topic. They suggest
collecting the sites identified by a traditional
search engine, but then classifying each
page according to how many other pages
in this set have links to them. This takes
advantage of the assumption that someone
creating a Web page on X is likely to
know many other pages on that topic, and
will to what they think of as the best
pages for X. Taking a subset of pages
that score well on this measure, i.e.,
pages "respected" by a reasonable
number of authors, then iterating the
process, should be able to generate a
reasonable guess as to the most useful
or definitive starting points for your
search.
- Programming
Language: Java.
- Prerequisites:
CS 195.
- Status:
Java provides easy tools to access
Web pages, extract URL's from those pages,
and analyze the URL's. We have a code
example that does such analysis that would
make a good starting point for the project.
The full algorithm is up for patent consideration,
but the part that's been publicly discussed
can be emulated by someone else, especially
if they're producing freeware.
back
to top of page
Mancala:
- Overview:
Mancala is a class of games that are played
in Africa and many other areas of the
world. They involve playing pieces (seeds,
pebbles, etc.) that are moved around the
board, allowing one person to capture
the pieces of the other player. The exact
form of the rules varies tremendously.
There are many shareware/buyware versions
of this game available, but the idea of
this project is to construct a program
that can play as many different versions
as possible; allowing an exploration of
the many different types of rules that
can be applied. The book Mancala Games,
by Larry Russ (available from Prof. Chavey),
includes rules for more than 100 variations
of this game.
- Programming
Language: Java or HyperTalk.
- Prerequisites:
CS 195.
- Status:
A prototype that plays 50 or so variations
of this game has been built in HyperTalk.
A strong CS 111 student could work with
this program itself to improve the current
program, e.g. allowing it to play more
variations. Converting this program to
Java so that they game variations could
be played over the Web would be of significant
interest.
Solitaire
Game Strategies:
- Overview:
Many games are susceptible to attempts
to analyze their strategy. This type of
work has been done extensively for many
popular games, such as chess, checkers,
bridge, etc. Many solitaire card games,
however, have interesting potential for
such strategy analysis as a research project.
As one example, consider the solitaire
game "FreeCell", commonly played
on Windows machines. The player is allowed
to move cards from one pile to another
according to certain rules, move cards
to or from their "hand", up
to a maximum of four, and place cards
from the Ace up onto piles according to
suit. Certain situations are "bad,"
e.g. holding too many cards in your hand,
having too many cards in one pile, etc.
Other situations are "good",
e.g. having empty/nearly empty piles,
having many cards played to the suit piles,
etc. One can assign different weights
to a variety of good/bad features and
use a standard game tree analysis to look
for good moves. Given one particular weighting,
have the computer play a large number
of random games, and evaluate the efficacy
of its strategy. Have it adjust the weights
and repeat the process. Continue this
process, using any of a variety of tactics
(simple search, genetic programming, simulated
annealing, etc.) to try to find a strong
strategy.
- Programming
Language: Any.
- Prerequisites:
CS 195.
- Status:
I have a reasonable start at a large
number of factors that can be "weighted"
in the game of FreeCell.
Analysis
of Games of Positions:
- Overview:
An interesting class of mathematical games
consists of a graph with playing pieces
on the vertices; the players move the
pieces along the edges until one player
has no remaining move. This project attempts
to analyze which graphs lead to "interesting"
games of this type.
- Prerequisites:
Math 200 and CS 203.
- Status:
Beloit alumnus Jeff Lowrey is working
on this project.
- Details:
Full details are currently available.
-
back
to top of page
Complexity
Analysis:
- Overview:
For CS 111, I have built a "Stack
Analyzer", which does various basic
types of analysis of student code, such
as analysis of their comments, the complexity
of each method, etc. Porting some of these
capabilities to a C++ code analyzer would
be useful. A main idea of this project
is that if one assumes certain programming
standards, it is often easier to analyze
the resulting code -- even without the
compiler techniques used for professional
level programs of this sort. While stringent
coding standards are often inappropriate
in industry, it is reasonable to require
such standards for student programming,
especially in introductory courses.
- The
design of the project involves a series
of passes through the student code analyzing
it from various viewpoints. The first
task is to read in the code files, storing
the code in a linked list of objects --
one for each object in the code being
analyzed. The second pass separates the
code into individual methods, determining
(for example) which comments appear to
go with which comments, and optionally
applying some programming standards (e.g.,
"methods should be separated by two
blank lines") to the code. The third
phase analyzes the comments associated
with each method, e.g. inline comments,
variable declaration comments, method
header comments, etc. After analyzing
the comments, e.g. for quantity or spelling,
statistics about them are kept, but the
comments are discarded. The next phase
might analyze the number of separate variables
used, complain about poor variable names,
etc. At each phase, the project intent
includes "cleaning up" the code
under analysis -- putting it into a standard
form, removing aspects that aren't necessary
for later phases (e.g., the comments),
and generally simplifying the code for
later analysis. Additional analysis phases,
depending on the interest of the student
researcher, can include looking at the
code length for individual methods, structural
complexity (the number of control paths
through loops and conditionals), program
design complexity (the number of method
calls to other types of objects), module
cohesion (the amount of data sharing within
an object), use of parameters and local
variables, potential naming confusion,
etc. The design feature of using multiple
passes for these analyses is partially
intended to support multiple students
working on the project, possibly over
several semesters.
- Programming
Language: C++.
- Prerequisites:
CS 195.
- Status:
HyperTalk prototype to analyze HyperTalk
exists. Some ideas in that product can
be useful in the overall design of the
program, but because of the differences
in analyzing C++ vs. analyzing HyperTalk,
most of the design work and programming
will be original.
Testing
Analysis:
- Overview:
Many professional programs exist that
support testing analysis, but very little
is publicly available. The idea here is
to insert method calls into programs on
each control path that would "remember"
if that path had been executed during
testing of the code. When ready to do
full testing of a program, a student would
first run this program on their code.
They would then test their code, and a
report would be issued specifying which
control paths had been tested, and which
missed. The student could then re-run
their program to do more testing, the
embedded method calls would add newly
tested paths to the previous database
(assuming the code had not been changed),
and a new cumulative test report would
be generated.
- Programming
Language: C++
- Prerequisites:
CS 195.
- Status:
Significant amounts of design work
need to be done, in addition to programming.
-
back
to top of page
Web
Programming:
- Overview:
A tactic, developed by a couple of researchers
at CUNY-Brooklyn, for doing distributed
processing via a Web browser. You set
up a program on one machine, other machines
connect to it via a Web page, and are
given portions of the work to do. Use
this tool to solve some interesting problem
that can be decomposed in this manner.
- Programming
Language: Java
- Prerequisites:
CS 195.
- Status:
Need to find some good projects to
use. I have some ideas here, involving
especially games and discrete math. The
solitaire game project above is one example.
Tony Abell intends to work on this.
-
back
to top of page
|
Applications to Other Areas
|
Music
Analysis:
- Overview:
There is available code to listen to music
and convert it to numeric data. Can you
read that numeric data into a program
and make any sense out of it? For starters,
can you tell me the number of beats per
measure? Can you distinguish between 3/4
and 4/4 time? Can you distinguish between
6/4 and 3/4 time? For a real challenge,
can you distinguish between a Rumba and
a Foxtrot, or between Swing and Foxtrot?
- Programming
Language: Any.
- Prerequisites:
CS 195, good music background, especially
music theory.
- Status:
Not started.
Number
Words & Linguistics:
- Overview:
One branch of linguistics looks at the
similarities between words in different
languages and tries to infer the genealogical
connections between the languages, e.g.,
language family, by the amount of similarity.
A simplified form of this, which is more
tractable but less reliable than the full
problem, is to look at the similarities
in number words. I have a list of the
numbers 1 to 10 in 1800 different languages,
organized according to language family
(and sub-family). Analyze these words,
try to formalize a "linguistic distance"
between two such words, then construct
a graph of the distances between two languages
based on this set of 10 words. See to
what extent you can reconstruct the language
families from this simple form of analysis.
- Programming
Language: Any.
- Prerequisites:
CS 195, some experience with phonetics.
- Status:
Not started.
|
|