There are no problems

Only challenges.

At least that is what I have been told. It is all about articulating things the right way. Problems are negatively charged. Perhaps so much so that they cannot be solved. So that is a word we are not allowed to use. Instead we must use the word challenge, which is not negatively charged. The word challenge, instead of challenge, is supposed to sort of signal that this is something we solve. We just have to pull ourselves together, and then it turns out that the challenge is not a challenge at all, but instead a challenge.

I just think there is a challenge with saying challenge instead of challenge. The challenge is that we make the language imprecise. We create a challenge when we think we have to use a word other than challenge, because challenge is negative. You should not come with challenges, you should come with solutions. Yes, of course, but if I had the solution to this challenge, I would not need to bring it to you. Then it would no longer be a challenge. There might then be a challenge in implementing the solution to the challenge. But that would be a different challenge.

The challenge with not being allowed to call challenges challenges is twofold.

First, there is the challenge of having to call challenges challenges instead of challenges. It muddies the language. If you have read this far, you will know what I mean. It is actually challenging to write challenge instead of challenge every time, and it probably also leaves you rather challenged when you have to read the text. It trips up the challenge-solving if you are not allowed to call a challenge a challenge. It takes the focus away from the fact that when we solve this challenge, it is because it is a challenge, which is negative and therefore has to be solved. We flatten the way the challenge is posed, and so it also becomes challenging to see what the challenge actually consists of. Is it a challenge at all? Are we sure it is not just a challenge instead? Challenges are also something you can choose to accept. It is in the word. But not all challenges are so un-challengematic that you can ignore them. Some challenges are so challenging that you have to do something about them.

The second challenge is that a challenge does not stop being challenging just because we call it a challenge instead of a challenge. It is a bit like the originally neutral terms for people with a high concentration of melanin in their skin. They are no longer neutral (the terms, that is); they have become negatively charged. They became so because of racism. So now we use a different word, which is neutral. But the racism has not gone away, so before long the new, neutral word is just as negatively charged as the old, originally neutral but now negative word. Challenge is a nice new word that we can use instead of the ugly old word, challenge. But because reality is such that challenges are rarely actually positive (if they were, they would not be challenges), it only works for a short time. Before we know it, challenges are just as negatively charged as challenges originally were. And then we start talking about opportunities instead of challenges.

Have you been diagnosed with cancer? That is not a challenge. It is now an opportunity. You just have to seize it. And so the language becomes even more absurd. When you are not allowed to say challenge, but have to say opportunity instead of opportunity, it goes completely wrong. Opportunities sound nice. Nicer than opportunities. But just because we call it an opportunity, it has not stopped being an opportunity. The opportunity is still negative. We have merely started using a word that implies a choice to an even greater degree. So when I am faced with an opportunity, say a water pipe has burst, I have to try to see opportunity-solutions in the opportunity I have been given because the floor is awash. My downstairs neighbour has also been given an opportunity, as water is dripping from his ceiling. He will just have to see the bright side: he now has the opportunity to solve an opportunity.

There is also the possibility that people who are told to say opportunity instead of opportunity may begin to feel a bit ridiculous. They consider it possible that opportunities actually are opportunity-atic, and that they do not become any less negative because we say opportunity instead of opportunity. I am fairly sure my downstairs neighbour will look at me oddly when I tell him that he has an opportunity when his carpet has got wet because I have an opportunity with a flood in my kitchen. Possibly there is an opportunity in that people who use the word opportunity, instead of simply calling things by their proper name, become a bit of a laughing stock. Or that they are no longer taken seriously. I am not entirely convinced that the HR manager will regard it as an opportunity if his employees view him with scepticism every time he says opportunity.

Possibly we should really try to cut the challenges to the bone, and acknowledge that it cannot be ruled out that we are in fact dealing with a problem. Get out the toolbox of problem-solving tools. And instead of merely articulating things in a way that matches the latest management fad, actually solve the problems.

Fake News. And the importance of numbers

Or rather, the importance of all the numbers.

This is not a political blog. On the other hand, I am a firm believer in facts. There is an objective truth out there. And the world can, more or less, be explained by numbers. And when facts and numbers are weaponized in political struggles, well, it’s hard to avoid wandering into dangerous territory.

With that disclaimer out of the way: recently I saw a graphic making the rounds on Facebook. A simple table listing median household income in the US by race. The numbers are said to come from the US Census Bureau.

The idea is to document that “white privilege” does not exist. “See all the people who are not white, and who make more money than whites”.

Could this be true? As a European, my expectation would be that white, Caucasian Americans are in general terms better off than, for lack of a better word, non-whites in the US (see, this is dangerous territory, what words do I dare to use?).

One of the great things about US federal institutions is that they generally provide quite free access to their data. I did not have the patience to learn how to navigate their website. Someone at Wikipedia did, and made this nifty table.

I’m not going to copy all the data. There’s a LOT. But there are three tables: median household income by race, by ancestry, and by Native American tribe. Let us take a look at the first:

Rank  Race                                        Median household income (2015 USD)
1     Asian-American                              91,440
2     White                                       59,698
3     Native Hawaiian and other Pacific Islander  55,607
4     Some other race                             42,461
5     American Indian and Alaska Native           38,530
6     Black or African American                   36,544

The numbers do not quite match. But “white” at 59,698 USD compares OK with the 60,256 USD in the graphic. The graphic claims that the numbers are from 2014. The Wikipedia numbers are from 2015. Small differences should not throw us off course.

That is more in sync with my expectations. But are the numbers in the graphic cooked?

Nope. They are actually correct. The next table on Wikipedia lists median household incomes by ancestry. Indian American in the graphic: 101,561 USD. Indian American on Wikipedia: 101,591 USD. Same with Taiwanese. And the others. There is a small detail: the numbers by ancestry appear to come from the 2014 data from the US Census Bureau, and not the 2015 data. Again, this is a detail. The main point of the graphic is not that Indian Americans make 41,335 USD more than whites, but rather that they make more money. And that Taiwanese Americans do, and that Filipino Americans do and that… well, you get the point.

All right, so what is wrong with that graphic? The thing that is wrong is that it cherry-picks the data. Let us take a look at the table of median household income by ancestry. Just the first six rows:

Rank  Ancestry             Income (USD)
1     Indian American      101,591
2     Taiwanese American   85,566
3     Filipino American    82,389
4     Australian American  81,452
5     Israeli American     79,736
6     European American    77,440

I will NOT be dragged into the battles about what constitutes a race, how white you should be to be considered white or the whole topic of trans-racialism.
But in the context of the original table, “European American” is rather white. And when we get to “Danish American” (median income 68,558 USD) we are, statistically speaking, talking about people who are very white.

Conclusion: It is not enough to check that the numbers are correct. You also need to check that you have all the numbers.
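
A small sketch of that check in R, using only numbers quoted above. Keep in mind the caveat already mentioned: the race figure is from the 2015 table and the ancestry figures appear to be 2014 data, so the differences are indicative at best. The variable names are my own.

```r
# "White" row of the race table (2015 USD, as quoted above)
white_median <- 59698

# Ancestry rows quoted above, plus Danish American from the text
ancestry <- data.frame(
  group  = c("Indian American", "Taiwanese American", "Filipino American",
             "Australian American", "Israeli American", "European American",
             "Danish American"),
  income = c(101591, 85566, 82389, 81452, 79736, 77440, 68558),
  stringsAsFactors = FALSE
)

# How does every row, not just a selection, compare with the "White" median?
ancestry$difference         <- ancestry$income - white_median
ancestry$above_white_median <- ancestry$difference > 0

print(ancestry)
```

Looking at all the rows, rather than a hand-picked few, is what shows that the rather white ancestries sit in the same part of the table.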

Stuff to keep in mind

Cynical lessons for lower middle management in times of change.

  • There is no one but you who takes care of you.
  • Do not expect any kind of support. It might come, but don’t count on it.

More lessons to come as I finally learn them.

7/11 in Denmark – a map

Let’s be honest. I am easily distracted. While I was thinking about how to plot networks of coauthorships in Acta Chemica Scandinavica, I tinkered with getting data on my Twitter following.

That’s easy enough, but I thought it would be cool to map them. While googling that (I know. There are automated ways to do that, there are scripts I can just copy. It’s not difficult. I just want to do it myself.) I stumbled across some neat heatmaps visualizing the distance to fast-food outlets in the US. That looked like fun.

So now I’m hammering away at that instead.

What to map? I need coordinates. For some reason I thought of 7/11. Probably something to do with those US fast-food outlets. Anyway, http://www.7-eleven.dk/find-butik/ has a map. And coordinates on the page!

Save, download. Delete everything that does not contain coordinates. Keep the parts with “value=[coordinates]”. And save the file. And then, some R. Set the working directory, read the file with readLines(), and use string functions to extract the coordinates:

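rm(list=ls())                              # start from a clean workspace
setwd("~/Desktop/711")
koo <- readLines("711_coordinates.html")   # one element per line of the saved page
koo_list <- c()
for(i in seq_along(koo)){
  # keep what sits between 'value="' and '">' on each line;
  # lines without 'value="' end up as NA
  res <- strsplit(koo[i], 'value=\"')
  res <- strsplit(res[[1]][2],'\">')
  res <- res[[1]][1]
  koo_list <- c(koo_list, res)
}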
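
The same extraction can probably be done without a loop. A minimal sketch using base R's regmatches() and regexpr(); the sample lines are made up for illustration, and the exact HTML around the coordinates is an assumption, not the real page source:

```r
# Hypothetical lines standing in for 711_coordinates.html
sample_lines <- c(
  '<input type="hidden" value="55.68096,12.58000">',
  '<input type="hidden" value="55.681899,12.583777">',
  '<div>no coordinates on this line</div>'
)

# Match value="<number>,<number>"; lines without a match are dropped
pattern <- 'value="[0-9.]+,[0-9.]+"'
hits <- regmatches(sample_lines, regexpr(pattern, sample_lines))

# Strip the leading 'value="' and the trailing quote
coords <- sub('value="', '', hits, fixed = TRUE)
coords <- sub('"', '', coords, fixed = TRUE)

print(coords)
```

One difference from the loop: lines without coordinates simply disappear here, instead of turning into NA values that have to be cleaned up afterwards.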

I have a nagging suspicion that there are repeated coordinates. Let’s get rid of those, and take a look:

koo <- unique(koo_list)
head(koo)
## [1] "55.68096,12.58000"   "55.681899,12.583777" "55.68216,12.57454"  
## [4] "55.67787,12.58015"   "55.68225,12.57044"   "55.68284,12.57098"

I need to get that into a dataframe:

# empty data frame to collect the coordinates in
koo_df <- data.frame(lat=character(), lng=character(), stringsAsFactors=FALSE)

for(i in seq_along(koo)){
  # split "lat,lng" at the comma and convert both halves to numbers
  koordinat <- strsplit(koo[i],',')
  new_row <- c(as.numeric(koordinat[[1]][1]),as.numeric(koordinat[[1]][2]))
  koo_df <- rbind(koo_df, new_row, stringsAsFactors=FALSE)
}
colnames(koo_df) <- c("lat", "lng")
tail(koo_df)
##          lat      lng
## 181 57.05891  9.92778
## 182 57.15559  9.73841
## 183 57.45580 10.04221
## 184 57.45665  9.98596
## 185 57.44108 10.53995
## 186       NA       NA

Oh, and I should get rid of any missing values:

row.has.na <- apply(koo_df, 1, function(x){any(is.na(x))})
koo_df <- koo_df[!row.has.na,]
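
Growing a data frame row by row works for a couple of hundred stores, but the splitting and the NA cleanup can also be done in one go. A sketch under the assumption that the input looks like the strings printed by head(koo) above; the _example names are mine, so nothing from the steps above is overwritten:

```r
# Example input in the same "lat,lng" format as above, including one NA
koo_example <- c("55.68096,12.58000", "55.681899,12.583777", NA)

# Drop missing entries first, then split every string at the comma
parts <- strsplit(koo_example[!is.na(koo_example)], ",", fixed = TRUE)

# Stack the pieces into a two-column character matrix
mat <- do.call(rbind, parts)

# Convert to numbers and name the columns in one step
koo_df_example <- data.frame(
  lat = as.numeric(mat[, 1]),
  lng = as.numeric(mat[, 2])
)

# Safety net: drop any row where the conversion produced NA
koo_df_example <- koo_df_example[complete.cases(koo_df_example), ]

print(koo_df_example)
```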

I need some libraries:

library("maps")
library("mapdata")
library('sp')
library('maptools')
## Checking rgeos availability: FALSE
##      Note: when rgeos is not available, polygon geometry     computations in maptools depend on gpclib,
##      which has a restricted licence. It is disabled by default;
##      to enable gpclib, type gpclibPermit()
library('spatstat')
## Loading required package: nlme
## Loading required package: rpart
## 
## spatstat 1.46-1       (nickname: 'Spoiler Alert') 
## For an introduction to spatstat, type 'beginner'
## 
## Note: spatstat version 1.46-1 is out of date by more than 11 weeks; a newer version should be available.

maps helps with drawing maps. mapdata provides maps of the world. sp gives methods for handling spatial data, lines and polygons for example. I’m going to need that. maptools is another bunch of tools for handling maps. spatstat is a set of tools for handling spatial statistical data. I only really need one of them (I think). But it is the most important.

Let’s plot something:

kort <- map('worldHires', 'Denmark', col="white", fill=TRUE, bg=NA, 
    xlim=c(8,17), ylim=c(53,58), resolution=0)

Map takes a lot of different arguments. The first two: worldHires tells it that it should look to the worldHires map from mapdata. And that we should look specifically to the part called Denmark. col allows me to choose the color of the map – I’m going with black’n’white here. Fill – should the areas be filled with the color (yes, they should, for reasons I’ll get back to.) bg is the background color, and I can limit the part of the map I want to draw. Maps have coordinates as degrees as their natural units. Longitude 8 to 17, latitude 53 to 58 does nicely. The Faroe Islands are in the map (but for some reason not Greenland), I’m not going to bother with them. And Bornholm is going to be cut out later. Well, Bornholm is going to get cut now:

kort <- map('worldHires', 'Denmark', col="green", fill=TRUE, bg="blue", 
    xlim=c(8,13), ylim=c(54,58), resolution=0)

Resolution: 0 draws the map at the full resolution of the database; higher values thin out the outlines before they are plotted.

There are some problems here. Smaller islands have disappeared. The resolution is not fantastic. But all that can be easily solved later. The hard part is the following.

Let’s get back to the original map, and take a look:

kort <- map('worldHires', 'Denmark', col="white", fill=TRUE, bg=NA, 
    xlim=c(8,13), ylim=c(54,58), resolution=0)

str(kort)
## List of 4
##  $ x    : num [1:11164] 8.66 8.68 8.69 8.7 8.72 ...
##  $ y    : num [1:11164] 54.9 54.9 54.9 54.9 54.9 ...
##  $ range: num [1:4] 8.09 12.62 54.56 57.75
##  $ names: chr [1:10] "Denmark" "Denmark:Mors" "Denmark:Mon" "Denmark:Samso" ...
##  - attr(*, "class")= chr "map"

The kort variable (kort because that is Danish for map) is a list with four elements. x and y are numeric vectors that define the shapes. How that is actually done, given that Denmark consists of however many islands, I have no idea. There is a range. And there are 10 named areas in the object. All of them Denmark, but some of them islands like Mors, Møn etc.
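
As far as I can tell from the maps documentation, the trick is that the x and y vectors hold all the polygons one after another, separated by NA values. A toy sketch of that layout in base R; the two squares are invented, and only the NA-separator idea is meant to carry over to kort$x and kort$y:

```r
# Two toy "islands" (unit squares) stored the way a map object stores
# its outlines: one long vector per axis, with NA between the polygons
x <- c(0, 1, 1, 0, NA, 2, 3, 3, 2)
y <- c(0, 0, 1, 1, NA, 0, 0, 1, 1)

# Every NA starts a new polygon, so a running count of NAs labels the pieces
piece <- cumsum(is.na(x))
keep  <- !is.na(x)

# Split the coordinates into one data frame per polygon
islands <- split(data.frame(x = x[keep], y = y[keep]), piece[keep])

length(islands)   # number of separate polygons
```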

Let’s also plot the location of 7/11 stores in Denmark:

map(kort)
points(koo_df$lng, koo_df$lat)

We’ll get back to that.

What I want to plot is a distance map: different colors depending on how far a given position is from the nearest store.

The library spatstat has a function that handles that. It’s called distmap(). It takes an object of class “ppp”, which can be made with the function ppp(), also from spatstat.

ppp() takes two vectors with the x and y coordinates, and a window defining the, well, window of the map. The window is an object of class “owin”.

Therefore, I should begin by making the point pattern dataset:

A <- ppp(koo_df$lng,koo_df$lat,window=owin(xrange=c(7,13),yrange=c(53,58)))
plot(A)

It looks a bit skewed. But otherwise OK. distmap() takes the ppp object. So it’s rather simple to calculate and plot:

z <- distmap(A)

plot(z)

Yeah. Very psychedelic. Looking at it, it appears that the scale is in degrees. The yellow colour should be “2.5”. That more or less matches 2½ degrees, if you remember that the y-scale is 5 degrees high.

One can almost see something that looks like the outline of Denmark. It gets easier if you know that it is there.

What would be nice, would be to define a window matching Denmark. Make a cut-out of Denmark in the distancemap.

That can be done. There is a function, as.owin.SpatialPolygons(), that can take polygons defined elsewhere and convert them to a window that can be used in the ppp() function (as far as I can tell, it ships with maptools rather than with spatstat itself). It takes an object of the type SpatialPolygons. Where do we get that? maptools has a function that can convert a map to SpatialPolygons: map2SpatialPolygons(). That function takes a map, and a list of the names saved in that map. We already have those: kort, and kort$names. So, first map2SpatialPolygons():

kort.poly <- map2SpatialPolygons(kort, IDs = kort$names)

Then as.owin.SpatialPolygons:

dk.owin <- as.owin.SpatialPolygons(kort.poly)

And now I have a nice cutout shaped like Denmark! I can even plot it!!

plot(dk.owin)

Neato! Now I just need to use the dk.owin window when defining the ppp object, rather than the rectangular window I used earlier:

A <- ppp(koo_df$lng,koo_df$lat,window=dk.owin)
## Warning in ppp(koo_df$lng, koo_df$lat, window = dk.owin): 20 points were
## rejected as lying outside the specified window

Run the distmap function:

dist <- distmap(A)

and plot it:

plot(dist)

Done! I do get a warning. 20 points are outside the window. Not that big a surprise, the map is not very precise. Whole islands are missing! But looking at the original map, and at what I get when I plot the window, something is happening. And maybe I should take a closer look at which projection is being used. After all, Denmark lies on a sphere. And the scale is annoying. I want it in kilometers, not degrees.
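
To get a feeling for what a degree is worth in kilometers, here is a rough sketch of the haversine great-circle formula in base R. It is not what distmap() does; it assumes a perfectly spherical Earth with a radius of 6371 km, and haversine_km() is my own helper, not a function from any of the packages above:

```r
# Great-circle distance in km between two points given in decimal degrees.
# Assumption: spherical Earth, mean radius 6371 km.
haversine_km <- function(lat1, lng1, lat2, lng2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * radius_km * asin(sqrt(pmin(1, a)))
}

# One degree north-south versus one degree east-west at 56 degrees north
one_deg_lat <- haversine_km(56, 10, 57, 10)
one_deg_lng <- haversine_km(56, 10, 56, 11)

print(c(lat = one_deg_lat, lng = one_deg_lng))
```

At Danish latitudes a degree of longitude is only a bit more than half as long as a degree of latitude, which is presumably also why the plots in plain degrees look skewed.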

Acta network part 1

Let’s get to work on the network. I don’t have any citation lists, so what I’m after here is coauthorship. I’m going to draw on some unpublished work I did on Zika virus research.

I can begin by disregarding all papers with only one author. I want to keep the publication year – as I want to animate the network.
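
A minimal sketch of that idea, on made-up data: drop the single-author papers, then turn every remaining author list into pairs of coauthors, keeping the year alongside. The author names, the years and the list layout are all hypothetical; the real data is structured differently.

```r
# Toy papers: author vectors plus publication years (all invented)
authors <- list(c("A", "B", "C"), c("A"), c("B", "C"))
years   <- c(1950, 1951, 1952)

# Keep only papers with more than one author
multi <- lengths(authors) > 1

# For each remaining paper, list every pair of coauthors with its year
edge_list <- lapply(which(multi), function(i) {
  pairs <- t(combn(sort(authors[[i]]), 2))
  data.frame(from = pairs[, 1], to = pairs[, 2], year = years[i],
             stringsAsFactors = FALSE)
})
edges <- do.call(rbind, edge_list)

print(edges)
```

Each row is one edge in the network for a given year, which should be the kind of table a network or animation package can be fed.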

As always, we’ll begin by reading the data:

data <- readRDS(file="d:\\acta\\consistentdata.rda")
head(data)

Animated Acta Chem. Scand.

Okay. I would like to make some networks. I would like to animate them. And I’m quietly harvesting all the PDFs in order to OCR them, and see what I can learn from that.

Before animating networks, it might be a good idea to animate something simpler.

We went on a visit to my sister in law, and on the rather long trip (by train), I had time to read up on network-graphs in R. Some of the last pages were about animation.

So, without further ado, let me introduce this new library:

library(animation)
## Warning: package 'animation' was built under R version 3.2.5

I’ll get back to that. Let’s begin by getting at the data:

Average number of authors in Acta Chem. Scand.

We saw the rapid decline of German as a scientific language in the last installment. But what else can we learn, given that I do not have access to the full-text papers?

Well, what about the number of authors on a given paper?

I have a hypothesis. In the beginning of time, chemistry was a lonely science, where individual scientists worked and published alone.

As the years went by, chemistry became a more collaborative science, with more scientists working together, and also publishing together.

So – the average number of authors on a paper will rise, as a function of time.

There is only one way to see if I’m right.
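
A rough sketch of how such a check could look, on invented data. The column names year and n_authors, and every number in the toy data frame, are assumptions for illustration only:

```r
# Toy data: one row per paper (all values invented)
papers <- data.frame(
  year      = c(1947, 1947, 1960, 1960, 1960),
  n_authors = c(1, 1, 2, 3, 1)
)

# Mean number of authors per publication year
avg_authors <- aggregate(n_authors ~ year, data = papers, FUN = mean)

print(avg_authors)
```

If the hypothesis holds, the n_authors column of the result should grow with the year.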

Price index

We are taking a short break from Acta Chemica Scandinavica, and this post was even written in Danish!

All of this is, by the way, just a note to myself for the next time I need to compare amounts over time. Most of the text is really just entertainment.

If I had 1000 kr in 2009, how many kroner does that correspond to today?
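
The arithmetic behind that kind of comparison is just a ratio of two index values. A minimal sketch; the index values below are hypothetical placeholders, NOT real price index figures:

```r
# Hypothetical index values, made up for illustration only
index_then  <- 100    # price index in the earlier year
index_now   <- 110    # price index today

amount_then <- 1000   # kroner in the earlier year

# Scale the old amount by the growth in the price index
amount_now <- amount_then * index_now / index_then

print(amount_now)
```

With real index values plugged in, the same line gives the amount in today's kroner.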

Acta Chem. Scand – now with plots!

I got through the first three steps of data analysis. The data is harvested. It is technically correct. And it is consistent.

And I made a beginning on the analysis part. But it is not that interesting to figure out how well textcat recognizes languages.

Therefore it is time to actually do some analysis. Let’s begin by loading the data, and refamiliarize ourselves with the structure:

data <- readRDS(file="d:\\acta\\consistentdata.rda")
str(data)
