CARMA Retreat 2015
9:00 am - 6:00 pm, Saturday 5th September
Abstracts
 Speaker: Salman Cheema
 Title: How Large Should a Large Aggregate Association Index (AAI) be?
 Supervisors: Eric Beh and Irene Hudson
 Abstract:
 Aggregate data arises in situations where survey research or other means of collecting individual-level data are either infeasible or inefficient. The recent increasing use of 
aggregate data in the statistical and allied fields (including epidemiology, education and social sciences) has arisen for a number of reasons. These include the questionable 
reliability of estimates when sensitive information is required, the imposition of strict confidentiality policies on data by government and other organisational bodies and, in some 
contexts, the impossibility of collecting the information that is needed. In this paper we present a novel approach to quantify the statistical significance of the extent of association that 
exists between two dichotomous variables when only the aggregate data is available. This is achieved by examining a newly developed index, called the aggregate association index (or the 
AAI), developed by Beh (2008 and 2010), which quantifies the overall extent of association about individuals that may exist at the aggregate level when individual level data is not 
available.
 The applicability of the technique is demonstrated by using leukaemia relapse data of Cave et al. (1998). This data is presented in the form of a contingency table that 
cross-classifies the follow-up status of leukaemia relapse by whether cancer traces were found (or not) on the basis of polymerase chain reaction (PCR), a modern method used to detect 
cancerous cells in the body that was assumed superior to the conventional method of that period, microscopic identification. Assuming that the joint cell frequencies of this table are not available, and 
that the only available information is contained in the aggregate data, we first quantify the extent of association that exists between both variables by calculating the AAI. This index 
shows that the likelihood of association is high. As the AAI has been developed by exploiting Pearson's chi-squared statistic, the AAI inherently suffers from the well-known large 
sample size effect that can overshadow the true nature of the association shown in the aggregate data of a given table. However, in this presentation we show that the impact of sample 
size can be isolated by generating a pseudo population of 2x2 tables under the given sample size. Therefore, the focus of this paper is to present an approach to help answer the 
question "is this high AAI value statistically significant or not?" by using aggregate data only. The answer to this question lies, we believe, in the calculation of the p-value of the 
nominated index. We shall present a new method of numerically quantifying the p-value of the AAI thereby gaining new insights into the statistical significance of the association 
between two dichotomous variables when only aggregate level information is available. The pseudo p-value approach suggested in this paper enhances the applicability of the AAI and thus 
can be considered a valuable addition to the literature of aggregate data analysis.
 Keywords: Aggregate data, Aggregate Association Index, pseudo p-values, Ecological inference, sample size
 
 
 Speaker: David Franklin
 Title: Hardy Spaces and Paley-Wiener Spaces for Clifford-valued functions
 Supervisor: Jeff Hogan
 Abstract:
 In one dimension, the Hardy and Paley-Wiener Spaces relate the support of a function's Fourier transform to the growth rate of its analytic extension. In this talk we show that 
analogues of these spaces exist for Clifford-valued functions in n dimensions, using the Clifford-Fourier Transform of Brackx et al. and the monogenic (n+1 dimensional) extension of 
these functions.
  
 Speaker: Ohad Giladi
 Title: Concentration around a hyperplane in a quasi-normed space
 Supervisor: Jon Borwein
 Abstract:
 It is shown that if the small ball probability of a random sum of vectors in a finite dimensional quasi-normed space does not decay too fast, then many of them are concentrated around 
a hyperplane. Joint work with Omer Friedland (Paris VI) and Olivier Guédon (Paris Est).
  
 Speaker: Cyriac Grigorious
 Title: On the Metric Dimension of Extended de Bruijn Digraphs and Extended Kautz Digraphs
 Supervisors: Mirka Miller and Joe Ryan
 Abstract:
 A metric basis for a digraph $G(V,A)$ is a minimum set $W \subset V$
such that for each pair of vertices $u$ and $v$ of $V$, there is a vertex $w \in W$ such
that the length of a shortest directed path from $w$ to $u$ is different from the
length of a shortest directed path from $w$ to $v$ in $G$; that is $d(w,u) \neq d(w,v)$.
The cardinality of a metric basis of $G$ is called the metric dimension and is
denoted by $\beta(G)$. We solve the metric dimension problem for extended
de Bruijn and extended Kautz graphs.
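
The definition above can be checked by brute force on very small digraphs. The following is a minimal sketch, not the method of the talk: the function names are hypothetical, and it assumes that two vertices which are both unreachable from $w$ count as having equal distance from $w$.

```python
from collections import deque
from itertools import combinations

INF = float("inf")

def distances_from(adj, source):
    """BFS distances along directed arcs; unreachable vertices keep distance INF."""
    dist = {v: INF for v in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == INF:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def metric_basis(adj):
    """Return a smallest set W such that every pair u, v has some w in W
    with d(w, u) != d(w, v). Exhaustive search, so only for tiny digraphs."""
    vertices = list(adj)
    dist = {w: distances_from(adj, w) for w in vertices}
    for size in range(1, len(vertices) + 1):
        for W in combinations(vertices, size):
            if all(any(dist[w][u] != dist[w][v] for w in W)
                   for u, v in combinations(vertices, 2)):
                return set(W)
    return set(vertices)

# Directed cycle on 5 vertices: distances from any single vertex are all distinct.
cycle = {i: [(i + 1) % 5] for i in range(5)}
# Complete symmetric digraph on 3 vertices: one vertex cannot separate the other two.
complete = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(len(metric_basis(cycle)))     # metric dimension of the directed 5-cycle
print(len(metric_basis(complete)))  # metric dimension of the complete digraph on 3 vertices
```

The search is exponential in the number of vertices, which is one reason closed-form results for specific families are of interest.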
  
 Speaker: John Harrison
 Title: Asymptotic behaviour of matrix random walks
 Supervisor: George Willis
 Abstract:
 I will describe the Poisson boundary of a discrete family of matrix groups under certain weak restrictions. The Poisson boundary is a space associated with every random walk on a 
locally compact group which encapsulates the behaviour of the walks at infinity and gives a description of certain harmonic functions on the group in terms of the essentially bounded 
functions on the boundary. I will introduce random walks and the Poisson boundary during the talk.
  
 Speaker: Colin Reid — DECRA
 Title: Chief series of locally compact groups
 Abstract:
 Locally compact groups are groups that also have a locally compact topology, compatible with the group structure.  This class of groups includes every group with the discrete topology 
(where the topology is irrelevant) and also compact groups ('small' from a topological perspective), but a general locally compact group does not decompose into compact and discrete 
groups.  A chief factor of a locally compact group G is a quotient K/L, such that K and L are closed normal subgroups of G and no closed normal subgroup of G lies strictly between K and 
L.  A chief series is a series $1 < K_1 < ... < K_n = G$ of closed normal subgroups of G such that $K_{i+1}/K_i$ is a chief factor for all i.  Locally compact groups do not generally 
have finite chief series, as they have too many compact and discrete factors.  However, it turns out that if G is generated by a compact subset, then G has a finite 'essentially' chief 
series, such that every factor is compact, discrete or a chief factor.  The 'large' chief factors are also unique in a certain sense.  This is joint work with Phillip Wesolek.
 
 
 Speaker: Björn Rüffer — new lecturer
 Title:  The dynamics of monotone vector inequalities
 Abstract:
 We will investigate nonlinear versions of the relation between a monotone vector inequality and qualitative properties of an associated dynamics.  The linear version of this vector 
inequality is $x \leq Ax + b$ (component-wise), where $x$ and $b$ are non-negative vectors, $A$ is a non-negative matrix, and the objective is to bound $x$. The matrix $A$ induces a 
dynamical system and, in the linear case, stability properties of that system are in one-to-one correspondence with the satisfiability of the inequality. For the nonlinear case we 
replace $A$ by a monotone mapping and obtain some interesting insights.
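
For the linear case, a small numerical illustration may help. This sketch assumes that the relevant stability property is that the spectral radius of $A$ is less than 1, in which case iterating $x_{k+1} = A x_k + b$ from zero converges to $(I - A)^{-1} b$ and every non-negative $x$ with $x \leq Ax + b$ lies below that limit. The matrix, vectors and function names are made up for illustration.

```python
def step(A, x, b):
    """One step of the affine dynamics x -> A x + b."""
    return [sum(a * xj for a, xj in zip(row, x)) + bi for row, bi in zip(A, b)]

def iterate_bound(A, b, iterations=200):
    """Iterate from x = 0; for spectral radius < 1 this approaches (I - A)^{-1} b."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        x = step(A, x, b)
    return x

def satisfies_inequality(A, x, b, tol=1e-12):
    """Check x <= A x + b component-wise."""
    return all(xi <= yi + tol for xi, yi in zip(x, step(A, x, b)))

# Non-negative matrix with eigenvalues 0.5 and 0.1 (illustrative values).
A = [[0.2, 0.3], [0.1, 0.4]]
b = [1.0, 2.0]

bound = iterate_bound(A, b)
x = [1.0, 1.0]  # a non-negative vector satisfying x <= A x + b

print(bound)                                       # approximately [2.667, 3.778]
print(satisfies_inequality(A, x, b))               # True
print(all(xi <= bi for xi, bi in zip(x, bound)))   # True
```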
  
 Speaker: Amir Salehipour
 Title: Tools to analyze impact of backtest overfitting on investment strategies
 Supervisor: Jon Borwein
 Abstract:
 In mathematical finance, backtest overfitting means the usage of historical market data (a backtest) to develop an investment strategy, where too many variations of the strategy are 
tried, relative to the amount of data available. Backtest overfitting is now thought to be a primary reason why quantitative investment models and strategies that look good on paper 
often disappoint in practice. In this talk we introduce two online tools, the Backtest Overfitting Demonstration Tool (BODT) and the Tenure Maker Simulation Tool (TMST), which 
illustrate the impact of overfitting on investment models and strategies.
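
The effect can be reproduced with a toy simulation. The sketch below is not the algorithm behind either online tool; it is an assumed setup in which every "strategy" is pure noise with zero expected return, so any good in-sample performance is due to selection alone. All names and parameter values are illustrative.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio: mean return divided by its sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)

def overfit_demo(n_strategies=200, n_periods=250, seed=0):
    """Pick the best of many noise strategies in-sample, then score it on fresh data."""
    rng = random.Random(seed)
    in_sample = [[rng.gauss(0.0, 0.01) for _ in range(n_periods)]
                 for _ in range(n_strategies)]
    in_sharpes = [sharpe(r) for r in in_sample]
    best = max(range(n_strategies), key=in_sharpes.__getitem__)
    # The selected strategy has no real edge, so new data is again pure noise.
    out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_periods)]
    return in_sharpes[best], sharpe(out_sample)

best_in, out = overfit_demo()
print(f"best in-sample Sharpe: {best_in:.3f}, out-of-sample Sharpe: {out:.3f}")
```

Increasing `n_strategies` while holding `n_periods` fixed can only raise the best in-sample figure, which is the "too many variations relative to the amount of data" problem in miniature.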
  
 Speaker: Sudeep Stephen
 Title: Power Domination in de Bruijn and Kautz digraphs
 Supervisors: Mirka Miller and Joe Ryan
 Abstract:
 Let $G(V,A)$ be a connected digraph. We call a set $W$ of vertices
critical if there is no vertex outside $W$ which has fewer than two neighbors in
$W$. In other words, $W \subseteq V$ is critical if $N^{+}_{W}(i) > 1$ for every $i \in N^{-}_{W}$.
If $W$ is critical, but no proper subset of $W$ is critical, then we call $W$ minimal critical.
A vertex set $S$ is a power dominating set if and only if $N^{+}_{G}(S) \cap W \neq \emptyset$ for
every minimal critical set $W$. In this talk, I discuss the results obtained for
the power domination problem in de Bruijn and Kautz digraphs.
  
 Speaker: R. Sundara Rajan
 Title: On Network Embeddings
 Supervisors: Mirka Miller and Joe Ryan
 Abstract:
 Interconnection networks provide an effective mechanism for exchanging data between processors in a parallel computing system. An interconnection network is often represented as a graph, where nodes represent 
processors and edges correspond to communication links between processors. An interconnection network should be designed so that it possesses excellent graph embedding ability, in order to execute 
parallel algorithms efficiently. Network embedding is an important technique used in the study of computational capabilities of processor interconnection networks and task distribution. The quality of an embedding can be measured 
by certain cost criteria, namely dilation, congestion and wirelength. My seminar mainly focuses on the wirelength of network embeddings.
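
Two of these cost criteria are easy to compute for a fixed embedding. The sketch below assumes each guest edge is routed along a shortest path in the host, so that dilation is the longest such path and wirelength is the total path length; congestion is left out because it depends on the particular routing chosen. Function names and the example graphs are illustrative.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted, connected host graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def embedding_costs(guest_edges, host_adj, mapping):
    """Return (dilation, wirelength) when every guest edge follows a shortest host path."""
    lengths = [bfs_distances(host_adj, mapping[u])[mapping[v]] for u, v in guest_edges]
    return max(lengths), sum(lengths)

# Guest: cycle on 4 vertices. Host: path on 4 vertices.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

identity = {0: 0, 1: 1, 2: 2, 3: 3}
interleaved = {0: 0, 1: 1, 2: 3, 3: 2}

print(embedding_costs(cycle_edges, path_adj, identity))     # (3, 6)
print(embedding_costs(cycle_edges, path_adj, interleaved))  # (2, 6)
```

In this example the second mapping lowers the dilation without changing the wirelength, which shows that the criteria need not be optimised by the same embedding.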
  
 Speaker: Garth Tarr — new lecturer
 Title: Robust methods and model selection
 Abstract:
 This presentation outlines two aspects of my research: robust statistics and model selection.
 Standard robust statistical procedures assume that less than half the observation rows of a data matrix are contaminated, which may not be a realistic assumption when the number of 
variables is large.  I will give an overview of some recent research into the problem of estimating covariance and precision matrices under a cellwise contamination model (Tarr et al., 
2015a).
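
The concern about rowwise contamination can be made concrete with a short calculation. This sketch assumes, purely for illustration, that each cell of the data matrix is contaminated independently with the same probability; the expected fraction of rows containing at least one contaminated cell is then $1 - (1 - \varepsilon)^p$ for $p$ variables.

```python
def contaminated_row_fraction(cell_rate, n_variables):
    """Expected fraction of rows with at least one contaminated cell,
    assuming independent cellwise contamination at rate cell_rate."""
    return 1.0 - (1.0 - cell_rate) ** n_variables

for p in (1, 5, 14, 20, 100):
    print(p, round(contaminated_row_fraction(0.05, p), 4))
```

Under this assumption, a 5% cell contamination rate already leaves more than half of the rows contaminated once there are 14 variables.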
 I will also briefly discuss the R package that provides a collection of functions designed to help users visualise the stability of the variable selection process (Tarr et al., 
2015b).  A browser based graphical user interface is provided to facilitate interaction with the results.  We have developed routines for modified versions of the simplified adaptive 
fence procedure (Jiang et al., 2009) and other graphical tools such as variable inclusion plots and model selection plots (Müller and Welsh, 2010; Murray et al., 2013). We also 
propose extensions to higher dimensional models via bootstrapping lasso estimates and incorporate robustness to outliers via an initial screening process (Filzmoser et al., 
2008).
 
 
 Speaker: Dushyant Tanna
 Title: Reflexive irregular total labeling
 Supervisors: Mirka Miller and Joe Ryan
 Abstract:
 Graph labelling is a mapping from a subset of graph elements to a set of numbers (usually non-negative integers). In most cases, the domain of the mapping is the set of vertices (called 
vertex labelling) or set of edges (called edge labelling) or both (called total labelling). We define the weight of a vertex as the sum of the label of that vertex and labels of 
incident edges. Similarly, the weight of an edge is defined as the sum of the label of that edge and labels of incident vertices. An edge (vertex) irregular total k-labeling of a graph 
G is a total labelling such that any two different edges (vertices) have distinct weights and k is the largest label. The total edge (vertex) irregularity strength of G, tes(G) (tvs(G)), 
is the smallest k such that G has an edge (vertex) irregular total k-labelling. Reflexive irregular total labeling is similar to irregular total labeling in many aspects, but there are some 
differences. In particular, in reflexive irregular total labeling the vertex labels represent loops and so have to be even numbers, and vertex label 0 (representing a loopless vertex) 
is allowed. Here, we will present basic results regarding reflexive irregular total labeling and provide the reflexive irregular edge and vertex strengths (res(G) and rvs(G), respectively) for 
stars, paths and complete graphs.
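
The edge version of these definitions can be checked mechanically. The sketch below uses hypothetical function names and assumes, beyond what the abstract states, that edge labels are positive integers; vertex labels must be even and non-negative as described above.

```python
def edge_weights(edges, vertex_label, edge_label):
    """Weight of an edge: its own label plus the labels of its two end vertices."""
    return {(u, v): vertex_label[u] + edge_label[(u, v)] + vertex_label[v]
            for u, v in edges}

def is_reflexive_edge_irregular(edges, vertex_label, edge_label):
    """True if vertex labels are even and non-negative, edge labels are positive
    (an assumption), and all edge weights are pairwise distinct."""
    vertices_ok = all(l >= 0 and l % 2 == 0 for l in vertex_label.values())
    edges_ok = all(l >= 1 for l in edge_label.values())
    weights = list(edge_weights(edges, vertex_label, edge_label).values())
    return vertices_ok and edges_ok and len(set(weights)) == len(weights)

def largest_label(vertex_label, edge_label):
    """The k of a k-labelling: the largest label used anywhere."""
    return max(list(vertex_label.values()) + list(edge_label.values()))

# Path a - b - c - d with three edges.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
vertex_label = {"a": 0, "b": 0, "c": 2, "d": 2}
edge_label = {e: 1 for e in edges}

print(edge_weights(edges, vertex_label, edge_label))                 # weights 1, 3, 5
print(is_reflexive_edge_irregular(edges, vertex_label, edge_label))  # True
print(largest_label(vertex_label, edge_label))                       # 2
```

With largest label 1 every vertex would have to be labelled 0 and every edge 1, so all three weights would coincide; the labelling shown therefore uses the smallest possible k for this path under the stated assumption.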
  
 Speaker: Andrew Thursby
 Title: Software for Education: Gamification Techniques to Support Online and Traditional Learning Methods
 Supervisor: Ljiljana Brankovic
 Abstract:
 Current research increasingly shows that utilising gamification methods in education adds value to traditional learning approaches, by increasing both students' motivation for 
learning, and their skills acquisition.
 Analysis of the research, as well as of gamification as a key element in the success of online communities such as FourSquare and Stack Overflow, demonstrates an emerging set of 
guidelines and tools that can be utilised by educators and educational program developers to 'gamify' learning, and which is being used to develop software for gamification. Continuous 
evaluation, however, is required within the field of gamification of education, particularly with the rapid pace of software development. Gamification techniques also should be 
implemented to complement and enhance, rather than replace, traditional learning techniques.
 In this talk I will give a brief survey of gamification in education and of the current software applications, and discuss the possibilities for extending the functionality of 
Blackboard to enable integration of gamification.
 
 
 Speaker: Duc Tran
 Title: Equivalences of Stability Properties for Discrete-Time Nonlinear Systems and extensions
 Supervisors: Christopher Kellett and Björn Rüffer
 Abstract:
 Several qualitative equivalences are demonstrated between various robust stability properties for discrete-time nonlinear systems. In particular, Input-to-State Stability (ISS) and 
integral ISS (iISS) are shown to be qualitatively equivalent, via a nonlinear change of coordinates, to linear and nonlinear l2-gain properties, respectively. These equivalences, 
together with previous results on the equivalence of 0-input global asymptotic stability and iISS provide interesting relationships between discrete-time robust stability properties 
that do not hold in continuous-time. Further extending these equivalences to a general input-output model, ISS with respect to two measures is used to subsume many ISS-type properties 
such as input-to-output stability (IOS), state-independent input-to-output stability (SI-IOS), and a version of incremental ISS.
  
 Speaker: Duy Tran
 Title:  The Aggregate Association Index to the Case of Stratified 2x2 Tables. An example: the New Zealand 1893 voting data
 Supervisors:  Eric Beh and Irene Hudson
 Abstract:
 Data aggregation often occurs due to data collection methods or confidentiality laws imposed by many governments and organisations. This kind of practice is carried out to ensure that 
privacy is protected and only the right amount of information is distributed. In the case of categorical analysis, the availability of only aggregate data or marginal totals of 
contingency tables makes it difficult to draw conclusions about the association between categorical variables. This issue lies in the field of Ecological Inference (EI) and is of 
growing concern for data analysts, especially for those dealing with the aggregate analysis of single or stratified 2x2 tables. Currently, there are a number of EI approaches that 
deal with the issue to varying degrees, but they still suffer from major shortfalls in the required assumptions (Hudson et al., 2010).
 As an alternative to ecological inference techniques when only marginal totals are available, one may consider the Aggregate Association Index (AAI) of Beh (2004, 2008, 2010) to obtain 
information about the association between two categorical variables of a single 2x2 table. The original AAI work is currently only applicable for a single 2x2 table, hence the purpose 
of this presentation is also to extend the application of the AAI to the case of stratified 2x2 tables. In particular, we will investigate the homogeneity among the AAIs of the 
stratified 2x2 tables and introduce a method to provide an overall AAI. To illustrate this new extension of the AAI, the New Zealand gendered voting data from 1893 is used in this 
presentation. The data set consists of a number of stratified 2x2 tables at electorate level and is also an interesting one as New Zealand was the first country in the world where women 
had the right to vote.
 Keywords:  Marginal Totals, 2x2 tables, Aggregate Data, Ecological Inference, Aggregate Association Index.
 
 
 Speaker: Nathan Van Maastricht
 Title: Computational Approaches to Finding Extremal Graphs
 Supervisor: Judy-anne Osborn
 Abstract:
 A brief look at a couple of different approaches to finding graphs with maximal cardinality of the edge set, given a fixed cardinality of the vertex set and a minimum 
girth. A binary integer linear program and a tree search will be discussed. Details of implementation optimisations in the tree search will be shown.
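
A tree search of this kind can be sketched in a few lines for very small cases. The version below is a plain include/exclude backtracking search with a simple counting bound; it is an illustration under assumed names, not the speaker's implementation, and it has none of the optimisations mentioned above.

```python
from collections import deque
from itertools import combinations

def distance(adj, source, target):
    """BFS distance in an undirected graph, or None if target is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def max_edges(n, girth):
    """Maximum number of edges in a graph on n vertices with no cycle shorter than girth."""
    candidates = list(combinations(range(n), 2))
    adj = {v: set() for v in range(n)}
    best = 0

    def search(index, count):
        nonlocal best
        best = max(best, count)
        # Prune when even taking every remaining candidate edge cannot beat the best.
        if index == len(candidates) or count + len(candidates) - index <= best:
            return
        u, v = candidates[index]
        d = distance(adj, u, v)
        # Adding edge uv closes a shortest cycle of length d + 1 (none if disconnected).
        if d is None or d + 1 >= girth:
            adj[u].add(v)
            adj[v].add(u)
            search(index + 1, count + 1)
            adj[u].remove(v)
            adj[v].remove(u)
        search(index + 1, count)

    search(0, 0)
    return best

print(max_edges(4, 4))  # 4, attained by the 4-cycle
print(max_edges(5, 5))  # 5, attained by the 5-cycle
print(max_edges(6, 5))  # 6
```

The search space doubles with every candidate edge, so this direct approach is only practical for a handful of vertices.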
  
 Speaker: Paul Vrbik
 Title: Yet Another Proof of Sylvester's Identity
 Supervisor: Jon Borwein
 Abstract:
 In 1857 Sylvester stated a result on determinants without proof that was recognized as important over the subsequent century. Thus it was a surprise to Akritas, Akritas and Malaschonok 
when they found only one English proof --- given by Bareiss 111 years later! To rectify the gap in the literature these authors collected and translated six additional proofs: four from 
German and two from Russian. These proofs range from long and "readily understood by high school students" to elegant but high level.
 We add our own proof to this collection which exploits the product rule and the fact that taking a derivative of a determinant with respect to one of its elements yields its cofactor. 
A differential operator can then be used to replace one row with another.
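
The cofactor fact used in this proof is easy to verify on a small integer matrix. Because the determinant is affine in any single entry, increasing entry $(i, j)$ by 1 changes the determinant by exactly its partial derivative in that entry. The sketch below, with illustrative names and an arbitrary example matrix, checks that this change equals the cofactor; it does not reproduce the proof of Sylvester's identity itself.

```python
def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not M:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def cofactor(M, i, j):
    return (-1) ** (i + j) * det(minor(M, i, j))

def partial_derivative(M, i, j):
    """Exact derivative of det(M) with respect to entry (i, j)."""
    bumped = [row[:] for row in M]
    bumped[i][j] += 1
    return det(bumped) - det(M)

A = [[2, -1, 3],
     [0, 4, 1],
     [5, 2, -2]]

print(det(A))  # -85
print(all(partial_derivative(A, i, j) == cofactor(A, i, j)
          for i in range(3) for j in range(3)))  # True
```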