Survey for quality of entries

Hi!

I am working on one of the PlanetMath Summer of Code projects, which attempts to identify the best users in the PM community. I would like to run a survey that would give people a chance to evaluate users in terms of the quality of the entries they produce. I would appreciate your feedback on the questions I could ask in such a survey. My current candidates are:
1. Quality of entry (overall)
2. clarity (good organization; flow?)
3. correctness (free of errors?)
4. pedagogy (does it facilitate learning?)
5. rigour (is it precise; well-supported?)
6. language (how are grammar, spelling, sentence construction and vocabulary?)
7. Would you recommend this entry to your friend?

Any feedback is welcome!


I am a little confused. Do you want us to rank the following items:

1. Quality of entry (overall)
2. clarity (good organization; flow?)
3. correctness (free of errors?)
4. pedagogy (does it facilitate learning?)
5. rigour (is it precise; well-supported?)
6. language (how are grammar, spelling, sentence construction and vocabulary?)
7. Would you recommend this entry to your friend?

Then, at the end of your survey, will you add up all the responses and come up with a final ranking of which criteria users find most important when rating an entry?

Good question, Chi. The user pjurczy wrote:

> I will show the entry (the most recent version) to the reviewer, with no information about the author/owner or the discussion/errata. At this level, I want to prevent people from being influenced by the fact that the author of a given entry is user x or y.

It is unclear to whom "reviewer" refers. Are we reviewing the entries, or some other set of people?

> It is unclear to whom "reviewer" refers. Are we reviewing the entries, or some other set of people?

We plan on soliciting general participation.

We're going to need as much data and hence as many volunteers as possible!

apk

> For now, I have one more question that you may want to consider adding to your good 7 questions: IS THE LENGTH OF THIS ENTRY APPROPRIATE? As it turns out, some entries are so short that they barely describe the object at hand.

Thanks; that's a good one.

You're right, entries themselves should be ranked too. In fact, it is integral to this reputation system that the content objects themselves have ranks (indeed, they are the primary things being "ranked", either implicitly or explicitly).

The general idea with such a system is to infer "reputations" for people so you can make an educated guess about the quality of the objects they are associated with when there is little or no feedback data. Of course, reputations could also serve the same function as the "score" we have now, albeit in a much more "holistic" and sophisticated way.
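To make that concrete, here is one toy way such an inference could work (just a sketch, not the actual algorithm we will end up with; the entries, owners, and ratings below are made up): an author's reputation is the average rating of his or her rated entries, and an unrated entry inherits that reputation as a first guess.

    # Toy sketch of reputation inference (not the actual PM algorithm).
    # The ratings and ownership data below are hypothetical.

    ratings = {"entry1": 4.5, "entry2": 3.0}                # direct feedback, where it exists
    owners  = {"entry1": "alice", "entry2": "alice",
               "entry3": "alice", "entry4": "bob"}          # who owns each entry

    def reputation(author):
        """Average rating over the author's rated entries (None if no data)."""
        scores = [r for entry, r in ratings.items() if owners.get(entry) == author]
        return sum(scores) / len(scores) if scores else None

    def estimated_quality(entry):
        """Use direct feedback when available, otherwise fall back to the owner's reputation."""
        return ratings.get(entry, reputation(owners[entry]))

    print(estimated_quality("entry1"))  # 4.5  (direct rating)
    print(estimated_quality("entry3"))  # 3.75 (inherited from alice's reputation)
    print(estimated_quality("entry4"))  # None (bob has no rated entries yet)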

Such rankings and scores will always have to be understood to be imperfect, of course. But we hope to get something more sophisticated than our present naive points-based system. Then we can use it to disclose some sort of confidence in an entry's quality, and use reputations to guide users towards the likely best entry when there are alternates.

If it works, this system could go a long way to eliminating confusion, and making PlanetMath more credible as a whole.

apk

Hi,
First, I am not sure whether my understanding of connectivity is the same as yours. I think you mean the number of out-links an entry has and the number of in-links (the number of other entries that point to this entry). If so, the answer is yes: we are going to consider this information to produce one of the rankings for users. Are autolinks useful in this process? I am not sure; I spoke with Aaron and we agreed that manual links are much more informative for our approach (described below). You are definitely right that it would be useful to be able to comment on the quality of autolinks, but we would like to focus on a slightly different part of the algorithm for finding the best users/entries.

In general, my idea for the ranking algorithm is to use HITS. You can take a look at the Summer of Code proposal: http://planetx.cc.vt.edu/AsteroidMeta/Pawel
The point here is that we would like to be able to predict the quality of new and existing entries based on a model of the PlanetMath community. It might turn out that we can exploit this knowledge and get quite good results.
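For concreteness, here is a minimal sketch of the basic HITS iteration on a toy graph of entries (the graph and entry names are invented; the real data, normalization, and setup may differ):

    # Minimal HITS sketch on a toy directed graph of entries.
    # The graph below is invented; in practice it would come from PM entry links.

    def hits(graph, iterations=50):
        """graph: dict mapping each entry to the list of entries it links to."""
        nodes = set(graph) | {t for targets in graph.values() for t in targets}
        hub = {n: 1.0 for n in nodes}
        auth = {n: 1.0 for n in nodes}

        for _ in range(iterations):
            # Authority score: sum of hub scores of the entries linking to this one.
            auth = {n: sum(hub[m] for m in graph if n in graph[m]) for n in nodes}
            # Hub score: sum of authority scores of the entries this one links to.
            hub = {n: sum(auth[t] for t in graph.get(n, [])) for n in nodes}
            # Normalize so the scores stay bounded.
            a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
            h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
            auth = {n: v / a_norm for n, v in auth.items()}
            hub = {n: v / h_norm for n, v in hub.items()}
        return hub, auth

    # Toy graph: "GroupHomomorphism" links to "Group" and "Kernel", etc.
    toy_graph = {"GroupHomomorphism": ["Group", "Kernel"],
                 "Kernel": ["Group"],
                 "Group": []}
    hub, auth = hits(toy_graph)
    print(sorted(auth, key=auth.get, reverse=True))  # entries ranked by authority

A user-level ranking could then be obtained by aggregating these entry scores over the entries each user owns (again, just one possible choice).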

The survey we are going to launch is required for us to be able to evaluate the different rankings we can get from applying HITS to the PM community. In other words, we are going to produce some ranking and then compare it with the one obtained from the survey. It may happen that one setup of HITS finds the best users overall, while another finds the users who produce entries with the best language.
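One simple way of doing such a comparison (only an illustrative sketch; the exact measure is not decided, and the user names below are made up) is a rank correlation coefficient such as Spearman's rho:

    # Sketch: comparing an algorithmic ranking with the survey ranking
    # using Spearman's rank correlation (assumes no ties).

    def spearman_rho(rank_a, rank_b):
        """rank_a, rank_b: the same items listed in two ranked orders."""
        pos_b = {item: i for i, item in enumerate(rank_b)}
        n = len(rank_a)
        d_sq = sum((i - pos_b[item]) ** 2 for i, item in enumerate(rank_a))
        return 1 - 6.0 * d_sq / (n * (n * n - 1))

    hits_ranking   = ["user_a", "user_b", "user_c", "user_d"]   # hypothetical
    survey_ranking = ["user_b", "user_a", "user_c", "user_d"]   # hypothetical
    print(spearman_rho(hits_ranking, survey_ranking))  # prints 0.8; 1.0 would mean identical rankings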

If you would like more clarification, please let me know.
Thanks,
Pawel

Since links are typically auto-generated, they primarily give a sense of the importance of the mathematical term that is defined in the entry, not necessarily a sense of the quality of the entry per se.

I agree, but:
1. The ranking based on HITS over links will not be the only one. The final ranking will very likely be some combination of different rankings obtained from running HITS with various setups (a toy sketch of one such combination follows this list). This is also just a proposal: if we discover that the ranking based on links does not make sense, we can drop it. But it has to be evaluated before we can say anything.
2. Links between entries are the most natural way of applying HITS to the PM community (I am not saying it is the best way...). The hub value, for instance, can indicate the breadth of a given user's knowledge. If a person is very topic-specific, e.g., his/her entries concern some particular area of math, that person will probably have a low hub value. On the other hand, someone who writes about general math subjects will probably have a high hub value. Therefore, we would like to take a closer look at the results of HITS on communities such as PM.
3. I mentioned that we would probably consider manual links rather than automatic ones, as those give more information about an entry's importance. We can, however, try to run the experiment on autolinks as well.
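As mentioned in point 1, here is a toy sketch of combining several rankings into one via a weighted average of rank positions (the rankings, users, and weights are invented; the real combination is still to be decided):

    # Toy sketch: combining several rankings via a weighted average of rank positions.
    # Users, ranks, and weights are invented for illustration only.

    rankings = {
        "hits_manual_links": {"user_a": 1, "user_b": 2, "user_c": 3},
        "hits_autolinks":    {"user_a": 2, "user_b": 1, "user_c": 3},
    }
    weights = {"hits_manual_links": 0.7, "hits_autolinks": 0.3}

    def combined_ranking(rankings, weights):
        users = set().union(*(r.keys() for r in rankings.values()))
        score = {u: sum(weights[name] * rankings[name][u] for name in rankings)
                 for u in users}
        # Lower combined rank position = better.
        return sorted(users, key=lambda u: score[u])

    print(combined_ranking(rankings, weights))  # ['user_a', 'user_b', 'user_c']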

Pawel

Clearly, based on your analysis, a middle-to-low HITS value doesn't mean that an author is of low value as an author. It also doesn't mean that subsequent links should steer away from their entries. So, what does it mean?

I think considering manual links instead of autolinks is a really bad idea. People tend to use manual links mainly in cases where the autolinker is failing. They say almost nothing about the quality of the article being linked to. For this project, I would suggest giving manual and auto-generated links equal weight (although you could keep track of them separately too, in case there is a subsequent chance to make use of the distinction).

When running HITS on links, you probably cannot directly interpret the hub (or authority) value as an indicator of user quality. As I mentioned, a high hub value means that the entries of a given author have many links to other subjects. However, the hub/authority value is some kind of information, and I/we would like to check it. Middle-to-low value? I cannot say right now what the exact meaning of those values is, but this is what we would like to find out as well.

I do not agree that people use manual links only when autolinks fail (or maybe you are right, but then how do you explain the numbers below? Does the autolinker do that badly?). They are used quite often. I just looked at the database: 53% of objects have at least one manual link, and the average number of manual out-links per object is 1.4, so it is not that small. But you might be right that combining autolinks and manual links can be more beneficial.
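For reference, the computation behind those numbers looks roughly like this (the link data below are made up; the real figures come from the PM database, whose schema may of course differ):

    # Sketch of how the manual-link statistics could be computed.
    # The toy data below stand in for the real PM database tables.

    manual_links = [(1, 2), (1, 3), (2, 3)]   # (source object, target object) pairs
    all_objects  = {1, 2, 3, 4}

    objects_with_manual_link = {src for src, _ in manual_links}

    pct_with_link = 100.0 * len(objects_with_manual_link) / len(all_objects)
    avg_out_links = float(len(manual_links)) / len(all_objects)

    print("%d%% of objects have at least one manual link" % pct_with_link)
    print("average manual out-links per object: %.1f" % avg_out_links)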

Pawel

Look more closely at the articles with manual links. Often the reason for a manual link is that people want to use words in \link{StrangeAttractor}{weird} or novel ways. This doesn't really tell you anything interesting about the targets of these links. Other times, the autolinker just fails -- and, yes, historically this happens a lot! -- not so much for technological reasons, but because only words that are listed in some article's Defines or AlsoDefines field are linked. Any sample of manual links should confirm what I say.

Regarding the autolinker, you might want to look at the code of an entry that I just made:

http://planetmath.org/encyclopedia/TheoremOnConstructibleAngles.html

To the best of my knowledge, this was the most efficient way to get all of the links that I wanted. I admit that I am very finicky about proper linking, and with the autolinker in its current state, I have to do quite a number of forced links to get what I want.

Warren

> Look more closely at the articles with manual links.
> Often the reason for a manual link is that people want
> to use words in \link{StrangeAttractor}{weird} or novel
> ways. This doesn't really tell you anything interesting
> about the targets of these links. Other times, the
> autolinker just fails -- and, yes, historically this
> happens a lot! -- not so much for technological reasons,
> but because only words that are listed in some article's
> Defines or AlsoDefines field are linked. Any sample of
> manual links should confirm what I say.

Those are not the kind of "manual" links we're talking about. Rather, we are talking about what you see below the entry in the "See also:" field.

apk

> > Look more closely at the articles with manual links.
> > Often the reason for a manual link is that people want
> > to use words in \link{StrangeAttractor}{weird} or novel
> > ways. This doesn't really tell you anything interesting
> > about the targets of these links. Other times, the
> > autolinker just fails -- and, yes, historically this
> > happens a lot! -- not so much for technological reasons,
> > but because only words that are listed in some article's
> > Defines or AlsoDefines field are linked. Any sample of
> > manual links should confirm what I say.
>
> Those are not the kind of "manual" links we're talking
> about. Rather, we are talking about what you see below the
> entry in the "See also:" field.

I don't suppose I could have known that from the conversation so far. Maybe one of you could post a clear summary of the plan for public discussion.

I am posting a clarification for the above (thanks, Wkbj79, for the feedback and questions in your email), which should help in understanding the point of the survey.

I am going to evaluate entries. Meaning: I will show the entry (the most recent version) to the reviewer, with no information about the author/owner or the discussion/errata. At this level, I want to prevent people from being influenced by the fact that the author of a given entry is user x or y.
Therefore, if you think about it:
1. Entries of people owning 'good' objects will be rated highly.
2. Entries of people who deal with corrections will (probably) also be rated highly.
3. Entries of other people will be rated low.

It is important to take a closer look at non-native English speakers. As suggested, the entries of some international users have some grammar issues. If a given user is really concerned about the quality of his/her entries, however, he/she probably cares about corrections and allows others to correct the entries he/she owns. Therefore, most of such entries would probably be rated highly. On the other hand, it is quite valuable information that some users produce entries with grammar issues.

I would like to get as much information from the survey as I can, while preventing it from being too cumbersome for reviewers :) It is hard to say which questions I will end up using later. Right now I am building a system for assigning priorities (or a quality measure) to users, and I need the results of this survey to somehow compare my ranking with a ranking obtained from real people. For instance, I might discover that my algorithm completely neglects the fact that a user makes grammar/language errors, but rewards the fact that he/she actually produces (or manages) good-quality entries.

Pawel

I just wanted to make sure people see this thread. This is your chance to help shape what kinds of quality indicators we use in evaluating the reputation system Pawel is working on for Summer of Code.

Also, the lessons learned from this activity will likely influence how we do entry ratings in the future, if we do them.

apk

Dear Pawel,

Let me start by saying that I am very *very* happy to see that this project is starting (at last!). Many of us feel that it is very important that we implement in PlanetMath (PM) some sort of reputation system (mainly for entries, although it would be interesting to discuss one for authors too), or verification system, or rating system, or certification system for entries (and I have been very outspoken about this, advocating for it time and again).

So let me say that I wouldn't look for the 'best users among PM members' but rather the best entries among PM entries. Right now, we are trying to decide what constitutes a 'good enough' entry (good enough to be in the encyclopedia), what constitutes an 'excellent' entry and, most importantly, which entries are not suitable or ready to be in the encyclopedia. As you know, the cornerstone of mathematics is *mathematical rigor*, and without it, an entry becomes meaningless and useless mumbo-jumbo. Nonetheless, we want to keep PlanetMath as a place for "math for the people, by the people". We don't expect everyone to be a math expert, but we do expect authors to be knowledgeable about the topics they write about, and also open to improvements and corrections (hence, it is very important that authors respond to corrections and messages in a positive way). On that note, we are thinking of reorganizing PlanetMath so that all entries have room here, but the best-quality entries will be in a place that has strong quality-control assurance. Here is where you come in to help: we need to be able to tell ok entries apart from no-no entries, and implement software for this process (anyway, I suppose you have good directions from akrowne).

For now, I have one more question that you may want to consider adding to your good 7 questions: IS THE LENGTH OF THIS ENTRY APPROPRIATE? As it turns out, some entries are so short that they barely describe the object at hand. I think every entry should offer as much information as possible to make the concept clear, either because it includes examples and ample explanation or because it points or links to appropriate entries where one can find such info/examples.

That's all for now, please feel free to contact me if I can be of any help or if you have questions that I may know the answer to.

Alvaro

Pjurczy,

Here are a number of links with previous thoughts about all this, from me and many other users. I think these will make very good reading for your purposes:

http://planetmath.org/?op=getmsg;id=6524

http://planetmath.org/?op=getmsg;id=2069

http://planetmath.org/?op=getmsg&id=7181

http://planetmath.org/?op=getmsg&id=9212

http://planetmath.org/?op=getmsg&id=12546

And then some wiki pages:

http://planetx.cc.vt.edu/AsteroidMeta/Community_Guidelines

http://planetx.cc.vt.edu/AsteroidMeta//card-based_temporary_rating_system

There are probably more pages and conversations out there, but those are all I could find for now. Happy reading!

Alvaro

Would connectivity be a worthy performance measure? Maybe allowing for some feedback on the autolinker.

Also, are you going to use pairwise comparison (http://en.wikipedia.org/wiki/Pairwise_comparison) in the evaluation? If not, could you explain the algorithm you plan to use? Thanks.

Ben

> My current candidates are:
> 1. Quality of entry (overall)
> 2. clarity (good organization; flow?)
> 3. correctness (free of errors?)
> 4. pedagogy (does it facilitate learning?)
> 5. rigour (is it precise; well-supported?)
> 6. language (how are grammar, spelling, sentence
> construction and vocabulary?)
> 7. Would you recommend this entry to your friend?

I think it would be helpful to add more details to the questions so that they are easier to understand, and get at the relevant dimensions of quality more effectively.

> 1. Overall quality of entry

Fine, I assume you will use some informative categories like "terrible", "poor", "mediocre", "good", "excellent".

> 2. clarity (good organization; flow?)

I would just say: "Quality of organization". "Clarity" is so general that it might as well be a synonym for "quality".

In the case of a short entry this probably means: does it link to other relevant entries that give you perspective on the overall topic? In the case of a longer entry where there are bigger chunks of internal structure, sure, "flow" matters: but there is also the possibility that the entry is in some sense too long. Maybe it would be better to break it into a collection of separate entries.

One difficulty, of which you see here only the tip of the iceberg, is that there are different kinds of entries. So, while you can certainly ask people to judge the "organization" of any entry, I think you are actually asking them to consider very different questions if the styles of the entries are different.

A further difficulty is the fact that a lot of "organization" is handled hypertextually, e.g. in the form of attachments. So, are you asking about the organization of an entry itself, or the organization of the entry in a context of attachments and related articles? Either way, I see trouble ahead!

Not that I think it is a bad question, just a hard one to get right.

> 3. correctness (free of errors?)

Fine, I'd say "Mathematical correctness". This handily unpacks "grammatical correctness" from the potentially more important category.

If you want, you could also ask people directly to rate the quality of the *language* used.

> 4. pedagogy (does it facilitate learning?)

This is rather vague: if you ask about "organization" and "quality of the language used", you've probably mostly dealt with the pedagogy question.

Maybe you can ask if the entry is at all "compelling", or "is the article interesting apart from the inherent interest of the topic covered?". Of course, a reference work can be quite *dry*... but something that is exhaustive and dry is probably more interesting than something that is dry and superficial.

Again, there are different kinds of entries. Well-written pedagogical entries may be more fun to read.

Maybe you could ask people to give the entry some primary category -- like, is the material "introductory", "technical", "pedagogical", etc.

> 5. rigour (is it precise; well-supported?)

Already dealt with via mathematical correctness and organization.

> 6. language (how are grammar, spelling, sentence
> construction and vocabulary?)

OK, you already had this. I'd put it earlier in the list (where I have it above).

> 7. Would you recommend this entry to your friend?

I don't see how that is relevant.
