Welcome — Ruby Enterprise Edition
-
- A copy-on-write friendly garbage collector. Phusion Passenger uses this, in combination with a technique called preforking, to reduce Ruby on Rails applications' memory usage by 33% on average.
- An improved memory allocator called tcmalloc, which improves performance quite a bit.
- The ability to tweak garbage collector settings for maximum server performance, and the ability to inspect the garbage collector's state. (RailsBench GC patch)
- The ability to dump stack traces for all running threads (caller_for_all_threads), making it easier for one to debug multithreaded Ruby web applications.
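As a rough illustration of the thread-dump idea, the sketch below collects a backtrace for every live thread. It does not use REE's caller_for_all_threads; it assumes a standard Ruby (1.9 or later) and relies only on the core methods Thread.list and Thread#backtrace. The helper name dump_all_thread_backtraces and the sleeping worker thread are made up for the example.

```ruby
# Sketch: collect a backtrace for every live thread using core Ruby only.
# This is NOT REE's caller_for_all_threads; it is a stand-in built on
# Thread.list and Thread#backtrace.
def dump_all_thread_backtraces
  Thread.list.each_with_object({}) do |thread, dump|
    # Thread#backtrace returns nil for a dead thread, so fall back to [].
    dump[thread] = thread.backtrace || []
  end
end

worker = Thread.new { sleep }               # hypothetical worker that blocks
sleep 0.05 until worker.status == "sleep"   # wait until the worker is parked

dump_all_thread_backtraces.each do |thread, frames|
  puts "Thread #{thread.object_id} (#{thread.status})"
  frames.each { |frame| puts "  #{frame}" }
end

worker.kill
```

In a web application, a helper like this could be called from a signal handler or a debug endpoint to see where each thread is currently stuck.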
Moses - Moses/RoadMap
-
- Chart parse-decoding: to support hierarchical models and syntax-based translation models
- Depth-first decoding: to provide anytime algorithms for decoding
- Forced decoding: to compute scores for provided output
- Suffix-array translation models: an alternative way to store large rule-sets without the need to translate them
- Maximum entropy translation models: translation models that incorporate additional source-side and context information for scoring translation rules.
Moses - Moses/FactoredModels
-
The training data (a parallel corpus) has to be annotated with the additional factors. For instance, if we want to add part-of-speech information on the input and output side, we need to obtain part-of-speech tagged training data. Typically this involves running automatic tools on the corpus, since manually annotated corpora are rare and expensive to produce.
Performance comparison: key/value stores for language model counts - Brendan O'Connor's Blog
-
I’m doing word and bigram counts on a corpus of tweets. I want to store and rapidly retrieve them later for language model purposes. So there’s a big table of counts that get incremented many times. The easiest way to get something running is to use an open-source key/value store; but which? There’s recently been some development in this area so I thought it would be good to revisit and evaluate some options.
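The counting step described above can be sketched in a few lines of Ruby. Here an in-memory Hash stands in for the key/value store, the tweets are invented sample data, and tokenization is a plain whitespace split; none of this is taken from the original post.

```ruby
# Count unigrams and bigrams over a list of texts.
# A Hash with a default value of 0 plays the role of the key/value store:
# every key can be incremented without checking whether it exists first.
def count_ngrams(texts)
  counts = Hash.new(0)
  texts.each do |text|
    tokens = text.downcase.split
    tokens.each { |word| counts[word] += 1 }                  # unigrams
    tokens.each_cons(2) { |a, b| counts["#{a} #{b}"] += 1 }   # bigrams
  end
  counts
end

tweets = ["good morning twitter", "good morning world"]  # hypothetical data
counts = count_ngrams(tweets)
puts counts["good"]          # unigram count
puts counts["good morning"]  # bigram count
```

With a real store, the `counts[key] += 1` line would become that store's increment operation, which is exactly the hot path such a benchmark would exercise.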
Moses - FactoredTraining/BuildingLanguageModel
-
Typically, LM estimation starts with the collection of n-grams and their frequency counters. Then, smoothing parameters
are estimated for each n-gram level; infrequent n-grams are possibly pruned and, finally, a LM file is
created containing n-grams with probabilities and back-off weights. This procedure can be very demanding
in terms of memory and time if applied to huge corpora. IRSTLM provides a simple way to split LM training
into smaller and independent steps, which can be distributed among independent processes.
Moses Language Model Howto v2 - GuardianiUS
-
Warning: This script runs GIZA++'s snt2cooc.out in the background. Use the following command if you have 1.5 GB of RAM or more; otherwise, skip it and try the alternative command below:
New Relic .:. On-Demand Application Management
-
With New Relic RPM, you can monitor 24x7, detect problems in real-time, drill down to find the causes, and continuously tune for high performance.
Twitter on Scala
-
But about a year ago they started replacing some of the back-end Ruby services with applications running on the JVM and written in Scala.
Six ways to run shell commands from Ruby - { :Alex Space => " Ruby Notes " } - 51CTO Tech Blog
-
When you need to call operating system shell commands, Ruby gives us six ways to get the job done:
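The excerpt stops before listing the six methods, so the sketch below is not necessarily the article's list. It shows six ways that do exist in Ruby's core and standard library: backticks, the %x literal, Kernel#system, IO.popen, Open3.capture3, and Kernel#exec. It assumes a POSIX system with an echo command and fork support; the helper name run_six_ways is made up for the example, and its argument is assumed to be trusted, since the first two forms pass it through the shell.

```ruby
require "open3"

# Run `echo <word>` six different ways and report what each one returns.
# Assumes a POSIX system; `word` must be trusted (no shell escaping here).
def run_six_ways(word)
  results = {}

  # 1. Backticks: run through the shell, return stdout as a String.
  results[:backticks] = `echo #{word}`.chomp

  # 2. %x literal: same behaviour as backticks, different delimiters.
  results[:percent_x] = %x(echo #{word}).chomp

  # 3. Kernel#system: returns true on exit status 0, false otherwise.
  results[:system] = system("echo", word, out: File::NULL)

  # 4. IO.popen: gives an IO object connected to the child's stdout.
  results[:popen] = IO.popen(["echo", word]) { |io| io.read.chomp }

  # 5. Open3.capture3: returns stdout, stderr and the exit status.
  stdout, _stderr, status = Open3.capture3("echo", word)
  results[:open3] = stdout.chomp if status.success?

  # 6. Kernel#exec replaces the current process, so run it in a forked child.
  pid = fork { exec("echo", word, out: File::NULL) }
  _, child_status = Process.wait2(pid)
  results[:exec] = child_status.success?

  results
end

p run_six_ways("hello")
```

Which one fits depends on what you need back: the output as a string (backticks, %x, IO.popen, Open3), just success or failure (system), or a full handover of the process (exec).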
About MyMemory - A collaborative language resource
-
MyMemory is the world's largest Translation Memory: 300m segments by end 2009
Just like a traditional TM, MyMemory stores segments and their translations, supporting translators with matches and concordance. It differs from traditional technologies in terms of the project's ambitious scale, and its centralized, collaborative architecture. Anyone may consult or contribute to MyMemory via the internet, although contributions are carefully vetted for quality.
[SRILM User List] Compiling srilm
-
The idea is that the default even for 64bit i686 machines is 32bit
compilation. That's why machine-type returns i686 by default.
I think the problem you saw can be avoided by adding -m32 in
common/Makefile.machine.i686.
To build 64bit binaries use
make MACHINE_TYPE=i686-m64
Andreas
Identifying Feature Relevance using a Random Forest :: Tech Videos, Screencasts, Webinars, Techtalks, Tutorials
-
Many feature selection algorithms are limited in that they attempt to identify relevant feature subsets by examining the features individually. This paper introduces a technique for determining feature relevance using the average information gain achieved during the construction of decision tree ensembles. The technique introduces a node complexity measure and a statistical method for updating the feature sampling distribution based upon confidence intervals to control the rate of convergence. Experiments demonstrate the potential of this method for feature selection and subspace identification.
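To make the notion of information gain concrete, here is a minimal sketch that scores a single categorical feature on a toy dataset. It is not the paper's method: there is no tree ensemble, node complexity measure, or feature sampling distribution here, only the basic entropy-based gain that a decision tree computes at one split. The dataset and the names entropy and information_gain are invented for the example.

```ruby
# Shannon entropy (in bits) of a list of class labels.
def entropy(labels)
  total = labels.size.to_f
  labels.group_by(&:itself).values.sum do |group|
    prob = group.size / total
    -prob * Math.log2(prob)
  end
end

# Information gain from splitting `rows` on `feature`:
# parent entropy minus the size-weighted entropy of each partition.
def information_gain(rows, feature, label)
  parent = entropy(rows.map { |r| r[label] })
  weighted_children = rows.group_by { |r| r[feature] }.values.sum do |subset|
    (subset.size.to_f / rows.size) * entropy(subset.map { |r| r[label] })
  end
  parent - weighted_children
end

# Hypothetical toy data: color determines the label, size is irrelevant.
rows = [
  { color: "red",  size: "big",   label: "yes" },
  { color: "red",  size: "small", label: "yes" },
  { color: "blue", size: "big",   label: "no"  },
  { color: "blue", size: "small", label: "no"  },
]

puts information_gain(rows, :color, :label)
puts information_gain(rows, :size, :label)
```

Averaging a score like this over the nodes of many trees is one way to rank features, which is the general direction the abstract describes.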
AI Ruby Plugins
-
If a project's first public appearance is documentation without code, code will not appear before the heat death of the universe.
Java Virtual Machine (JVM) - Re: Could not lock User prefs
-
System Preferences
After installing the Sun JDK on Red Hat Linux, you may receive the warning "WARNING: Could not create system preferences directory. System preferences are unusable."
To fix this, log on as root, and run any Java program from the command line. The SDK will indicate that it has created the directory and the permissions are set to work for any user. There are system preferences and user preferences. Only root will be allowed to change the system preferences, but anyone can read them.
Ragel State Charts
-
Adrian’s original intent, but he designed it to be flexible enough that Ragel just handles the new work easily.
- inject actions at any point in the machine’s transitions
- use code to alter the machine’s state on the fly
- flexibly mix between regex specification and state chart specifications
- produce a variety of machine styles like table, flat, or goto machines
- get varying degrees of control over the state minimization
- easily mesh it with whatever IO or character access code you have
- and it’s fast as hell
» A Hello World for Ruby on Ragel - DevChix - Blog Archive
-
then bear in mind that parsers are a great tool for constructing Domain Specific Languages (DSLs), and state machines are magic code shrinking machines for situations where you need to keep track of the, er, state of something and control the transitions between states
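For readers who have not written a state machine by hand, the sketch below shows the idea in plain Ruby: a transition table, a current state, and a set of accepting states. It is hand-written rather than Ragel-generated, and the machine (a recognizer for decimal numbers such as "12.5") and all of its names are invented for the example.

```ruby
# Transition table: [current state, character class] => next state.
# Any pair that is missing from the table is an invalid transition.
TRANSITIONS = {
  [:start, :digit] => :int,
  [:int,   :digit] => :int,
  [:int,   :dot]   => :dot,
  [:dot,   :digit] => :frac,
  [:frac,  :digit] => :frac,
}.freeze

# States in which the input seen so far is a complete number.
ACCEPTING = [:int, :frac].freeze

# Map a single character to the class used as a table key.
def char_class(ch)
  case ch
  when "0".."9" then :digit
  when "."      then :dot
  else               :other
  end
end

# Feed the input through the machine one character at a time.
def number?(input)
  state = :start
  input.each_char do |ch|
    state = TRANSITIONS[[state, char_class(ch)]]
    return false if state.nil?   # no transition defined: reject
  end
  ACCEPTING.include?(state)
end

puts number?("12.5")
puts number?("12.")
```

All of the logic lives in the table, so adding a new rule (a leading minus sign, say) means adding rows rather than nesting more conditionals.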
Wincent Colaiuta's weblog: Ragel wins! Fatality!
-
To be fair to ANTLR, not all of the time in the benchmark is spent inside the ANTLR lexer itself; some of it is spent converting the input from UTF-8 to UCS-2 before feeding it in to the lexer. Seeing as we're only testing how fast the lexer can spit out tokens, we let it off and don't worry about converting back to UTF-8 at the end (in the real, non-benchmark scenario, that is a penalty that we'd have to pay).
-
So I suspected that Ragel might be a bit quicker. No amount of prediction in a predictive lexer can beat a pure state machine, which by definition is going to run in constant time because it simultaneously explores all possible paths. When you add backtracking into the mix (as a result of trying to always favor the longest match) then you lose your constant time but it should still be fast because the state machines themselves are so efficient.
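Having fun with Stanford Parser Chinese processing in Ruby - 不喜欢
-
The Stanford Parser is written in Java, and driving it from the command line is complicated and hard to remember. With Ruby it becomes much simpler; here are a few examples of processing Chinese with Ruby: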